GraphQL: accessing more than 100 GitHub issues

Hi all,

A while ago I developed a utility for maintaining open-source projects, especially their issues on GitHub.

As a current limitation, the CLI can’t access more than 100 issues (https://github.com/MalloZup/blacktango).

My question: do you know of, or have experience with, a way to access all the issues in a project using GraphQL?

For example, if I had 5000 issues, how would this work with the GitHub GraphQL API?

See my current query: https://github.com/MalloZup/blacktango/blob/master/lib/github/graphql.ex#L9

I’d imagine it’s a hard limit set by GitHub to limit the execution time of queries. If you need more, use pagination to query the rest in additional requests.

The API supports pagination.

See https://developer.github.com/v4/object/repository/

Scroll down to “issues”.
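
To make the pagination idea concrete, here is a minimal Python sketch of the loop, not a definitive implementation. It assumes the issues connection is queried with `first: 100`, `after: $cursor` and `pageInfo { hasNextPage endCursor }`, which GitHub's GraphQL connections expose. `ISSUES_QUERY`, `iter_issues` and `run_query` are hypothetical names made up for this example; `run_query` stands in for whatever HTTP client you use.

```python
from typing import Callable, Dict, Iterator, Optional

# One page of up to 100 issues; $cursor is null for the first page.
ISSUES_QUERY = """
query($owner: String!, $name: String!, $cursor: String) {
  repository(owner: $owner, name: $name) {
    issues(first: 100, after: $cursor) {
      pageInfo { hasNextPage endCursor }
      nodes { number title }
    }
  }
}
"""


def iter_issues(
    run_query: Callable[[str, Dict], Dict], owner: str, name: str
) -> Iterator[Dict]:
    """Yield every issue, following endCursor until hasNextPage is false.

    run_query is a hypothetical callable (not part of any library): it takes
    the query text and a variables dict and returns the decoded "data" object.
    """
    cursor: Optional[str] = None
    while True:
        variables = {"owner": owner, "name": name, "cursor": cursor}
        issues = run_query(ISSUES_QUERY, variables)["repository"]["issues"]
        yield from issues["nodes"]
        if not issues["pageInfo"]["hasNextPage"]:
            break
        # The cursor of the last item on this page starts the next page.
        cursor = issues["pageInfo"]["endCursor"]
```

A real `run_query` would POST the query and variables as JSON to https://api.github.com/graphql with an access token. With 5000 issues and 100 per page, this loop makes 50 requests.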

The problem with the GitHub GraphQL API is that the rate limit is very low and everything is paginated. I once tried to fetch all issues and pull requests, with all their comments and actions, to analyze the data and give the repository a “rating” based on maintainer reaction times. That is impossible if you want to stay inside the rate limit, and the queries get very complex too, because you have to deal with several paginations at once if you want to use as few queries as possible.

Thx! If you have some examples on GitHub, that would be really appreciated… :cupid:. I’m pretty new to the GraphQL syntax, and their docs are kind of complex :grin:

Man, are you lucky today. The GitHub GraphQL Explorer still had one of my test queries from 2017 stored in my localStorage.

{
  repository (owner: "bolt", name: "bolt") {
    name
    issues (last: 5, before: "Y3Vyc29yOjIxMDI0NDc1NA==") {
      totalCount
      edges {cursor}
      nodes {
        number
        title
        comments (first: 5) {
          edges {cursor}
          nodes {
            body
          }
        }
      }
    }
  }
}

The edges {cursor} part is important because it gives you the cursor values for the before and after params.
As you can see in the query, I get some issues and then their comments. But the comments are also limited to a specific number, so you have pagination there too.
(Comments also have reactions (paginated), and issues have actions (opened, closed, etc.) (paginated too).) It really is a mess if you want to grab everything :smiley:
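
One way to keep the nested pagination manageable is to note, for each page of issues, which ones still have comments left to fetch. The Python sketch below assumes the comments selection also asks for `pageInfo { hasNextPage endCursor }`, which the query above does not include. `comments_to_continue` is a hypothetical helper name, not part of any library.

```python
from typing import Dict, List, Tuple


def comments_to_continue(issue_nodes: List[Dict]) -> List[Tuple[int, str]]:
    """Return (issue number, endCursor) for issues with more comment pages.

    Assumes each node has the shape
    {"number": ..., "comments": {"pageInfo": {"hasNextPage": ..., "endCursor": ...}}}.
    """
    pending = []
    for issue in issue_nodes:
        page_info = issue["comments"]["pageInfo"]
        if page_info["hasNextPage"]:
            # Remember where this issue's comment list stopped.
            pending.append((issue["number"], page_info["endCursor"]))
    return pending
```

Each pair returned can then drive a follow-up query for that single issue, with `comments(first: ..., after: <endCursor>)`. Reactions and actions would need the same treatment, which is where the number of requests starts to add up.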

Thx! I was hoping there was a cleverer way to get all the issues in one chunk, instead of iterating over them 100 at a time and updating the cursor as I go. :sob:

Yeah no, GitHub doesn’t really like it if you want to grab all the data. And since the rate limit is pretty low on the GraphQL API, you might even hit it while fetching all issues on a repository.
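
To see how close a script is to the limit, the GraphQL API lets you add a `rateLimit` field to any query. The Python sketch below decides whether to pause before the next request. It assumes the response contains `remaining` and `resetAt` (an ISO 8601 UTC timestamp); `seconds_to_wait` and `next_cost` are hypothetical names for this example.

```python
from datetime import datetime, timezone

# Selection to add at the top level of a query, next to `repository`.
RATE_LIMIT_FIELD = "rateLimit { cost remaining resetAt }"


def seconds_to_wait(rate_limit: dict, next_cost: int, now: datetime) -> float:
    """Return 0 if the remaining budget covers the next request,
    otherwise the number of seconds until the limit resets.

    `now` must be timezone-aware, e.g. datetime.now(timezone.utc).
    """
    if rate_limit["remaining"] >= next_cost:
        return 0.0
    # resetAt looks like "2024-01-01T12:30:00Z"; make the UTC offset explicit.
    reset_at = datetime.fromisoformat(rate_limit["resetAt"].replace("Z", "+00:00"))
    return max(0.0, (reset_at - now).total_seconds())
```

A paging loop could call this after every response and `time.sleep()` for the returned value before asking for the next page.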

:sob: In that sense the GitHub V3 API was imho better, at least for this use case. I never had issues like this.

You mean the REST API? Yeah, it is definitely easier to use but it still has pagination on everything. So while having a bigger rate limit, you also need to make more requests to the API to get the data you need.
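
For comparison, the REST API paginates through a `Link` response header, where the entry marked `rel="next"` points at the following page. Here is a minimal Python sketch of reading it. `next_page_url` is a hypothetical helper name, and the parsing assumes the URLs contain no commas.

```python
import re
from typing import Optional


def next_page_url(link_header: Optional[str]) -> Optional[str]:
    """Return the URL tagged rel="next" in a Link header, or None on the last page."""
    if not link_header:
        return None
    # Entries look like: <https://...>; rel="next", <https://...>; rel="last"
    for part in link_header.split(","):
        match = re.match(r'\s*<([^>]+)>\s*;\s*rel="([^"]+)"', part)
        if match and match.group(2) == "next":
            return match.group(1)
    return None
```

The loop is the same shape as with GraphQL cursors: request a page, read the next URL, and stop when there is none.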
