What method can I use to create a function that counts duplicates and prints them out

I have a huge JSON file, and I want to find out whether it contains any duplicates and, if so, which ones. The goal is to check whether the JSON responses captured in a log contain duplicates, and so verify that what the API returns looks good. I have read about Enum.uniq, but I am not sure whether it fits my case. I would appreciate any hints or suggestions.

Thanks in advance

Your question is a bit open-ended. It will help if you post some small JSON samples demonstrating the (un-)desirable cases.

Furthermore, do you strictly need this in Elixir? Tools like jq and gron are 99% likely to do the job.

I have seen there are some tools out there, but I don’t want to use them because the one I tried had certain limitations. I am not sure whether the ones you mention have similar limitations. Does that mean it’s hard to do in Elixir?

I am only including a small part, as the file is huge and I don’t know if I can paste all of it here; I am not allowed to upload it anyway:

{
  "deals": [
    {
      "hash": "497281f9",
      "owner": "1",
      "contact": "8",
      "organization": null,
      "group": "1",
      "stage": "2",
      "title": "Bono",
      "description": "",
      "percent": "0",
      "cdate": "2023-12-18T15:04:54-06:00",
      "mdate": "2023-12-18T15:04:54-06:00",
      "nextdate": null,
      "nexttaskid": null,
      "value": "9999900",
      "currency": "usd",
      "winProbability": null,
      "winProbabilityMdate": "2023-12-18T15:04:54-06:00",
      "status": "0",
      "activitycount": "2",
      "nextdealid": "569",
      "edate": "2023-12-18 15:11:06",
      "links": {
        "dealActivities": "https://avucon71549.api-us1.com/api/3/deals/6/dealActivities",
        "contact": "https://avucon71549.api-us1.com/api/3/deals/6/contact",
        "contactDeals": "https://avucon71549.api-us1.com/api/3/deals/6/contactDeals",
        "group": "https://avucon71549.api-us1.com/api/3/deals/6/group",
        "nextTask": "https://avucon71549.api-us1.com/api/3/deals/6/nextTask",
        "notes": "https://avucon71549.api-us1.com/api/3/deals/6/notes",
        "account": "https://avucon71549.api-us1.com/api/3/deals/6/account",
        "customerAccount": "https://avucon71549.api-us1.com/api/3/deals/6/customerAccount",
        "organization": "https://avucon71549.api-us1.com/api/3/deals/6/organization",
        "owner": "https://avucon71549.api-us1.com/api/3/deals/6/owner",
        "scoreValues": "https://avucon71549.api-us1.com/api/3/deals/6/scoreValues",
        "stage": "https://avucon71549.api-us1.com/api/3/deals/6/stage",
        "tasks": "https://avucon71549.api-us1.com/api/3/deals/6/tasks",
        "dealCustomFieldData": "https://avucon71549.api-us1.com/api/3/deals/6/dealCustomFieldData"
      },
      "id": "6",
      "isDisabled": false,
      "account": null,
      "customerAccount": null
    },
    {
      "hash": "8efe4b41",
      "owner": "1",
      "contact": "7",
      "organization": "2",
      "group": "1",
      "stage": "3",
      "title": "Leke",
      "description": "",
      "percent": "0",
      "cdate": "2023-12-18T15:04:01-06:00",
      "mdate": "2023-12-18T15:11:00-06:00",
      "nextdate": null,
      "nexttaskid": "0",
      "value": "1000000",
      "currency": "usd",
      "winProbability": null,
      "winProbabilityMdate": "2023-12-18T15:04:01-06:00",
      "status": "0",
      "activitycount": "2",
      "nextdealid": "6",
      "edate": "2023-12-18 15:10:53",
      "links": {
        "dealActivities": "https://avucon71549.api-us1.com/api/3/deals/5/dealActivities",
        "contact": "https://avucon71549.api-us1.com/api/3/deals/5/contact",
        "contactDeals": "https://avucon71549.api-us1.com/api/3/deals/5/contactDeals",
        "group": "https://avucon71549.api-us1.com/api/3/deals/5/group",
        "nextTask": "https://avucon71549.api-us1.com/api/3/deals/5/nextTask",
        "notes": "https://avucon71549.api-us1.com/api/3/deals/5/notes",
        "account": "https://avucon71549.api-us1.com/api/3/deals/5/account",
        "customerAccount": "https://avucon71549.api-us1.com/api/3/deals/5/customerAccount",
        "organization": "https://avucon71549.api-us1.com/api/3/deals/5/organization",
        "owner": "https://avucon71549.api-us1.com/api/3/deals/5/owner",
        "scoreValues": "https://avucon71549.api-us1.com/api/3/deals/5/scoreValues",
        "stage": "https://avucon71549.api-us1.com/api/3/deals/5/stage",
        "tasks": "https://avucon71549.api-us1.com/api/3/deals/5/tasks",
        "dealCustomFieldData": "https://avucon71549.api-us1.com/api/3/deals/5/dealCustomFieldData"
      },
      "id": "556",
      "isDisabled": false,
      "account": "1",
      "customerAccount": "1"
    }
  ],
  "meta": {
    "currencies": {
      "USD": {
        "total": "569",
        "value": "11136200",
        "currency": "USD",
        "isDisabled": false
      }
    },
    "total": 569
  }
}

Appreciate your help. Thanks!

Do you want to see if there are duplicates in the deals array? If not, can you make up an example of JSON with duplicates? It does not have to be actual secret data.

Look at Enum.group_by/3 or Enum.frequencies/1
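
To make that suggestion concrete, here is a minimal sketch. It assumes the JSON has already been decoded into a list of maps with string keys (for example with a JSON library of your choice) and that two deals count as duplicates when they share the same "id"; both are assumptions, since the thread has not yet established what a duplicate is. The module name DupeCheck, its functions, and the sample records are hypothetical and trimmed down for illustration. It uses Enum.frequencies_by/2, the key-function sibling of Enum.frequencies/1.

```elixir
defmodule DupeCheck do
  # Hypothetical helper module, not part of any library.

  # Returns %{key => count} for every key that occurs more than once.
  def duplicate_counts(records, key_fun) do
    records
    |> Enum.frequencies_by(key_fun)
    |> Enum.filter(fn {_key, count} -> count > 1 end)
    |> Map.new()
  end

  # Returns %{key => [records]} so the offending records can be inspected.
  def duplicate_records(records, key_fun) do
    records
    |> Enum.group_by(key_fun)
    |> Enum.filter(fn {_key, group} -> length(group) > 1 end)
    |> Map.new()
  end
end

# Made-up sample data: assumes "id" is the field that defines a duplicate.
deals = [
  %{"id" => "6", "title" => "Bono"},
  %{"id" => "556", "title" => "Leke"},
  %{"id" => "6", "title" => "Bono"},
  %{"id" => "7", "title" => "Vox"},
  %{"id" => "556", "title" => "Leke"}
]

# Print which ids are duplicated and how often.
deals
|> DupeCheck.duplicate_counts(& &1["id"])
|> IO.inspect(label: "duplicate ids")

# Print the duplicated records themselves, grouped by id.
deals
|> DupeCheck.duplicate_records(& &1["id"])
|> IO.inspect(label: "duplicate records")
```

Passing the key as a function keeps the definition of "duplicate" swappable: use `& &1["hash"]` or a tuple of several fields if "id" turns out to be the wrong criterion.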

It’s 99% likely it’s doable just fine in Elixir; I’m simply asking what your time budget is, i.e. how quickly you need to deliver. If it has to be done yesterday, then the CLI tools have your back.

And please respond to the previous poster’s question. Show us a list of 5 records of which 2 are duplicates. You can cut most of the fields so the snippet here is not huge. Let’s see what makes records duplicates.
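
To illustrate why the definition of a duplicate matters, here is a rough sketch that runs the same made-up records through three possible definitions. The records, their field values, and the helper name dupes_by are all hypothetical; the list is assumed to be already decoded from JSON into maps with string keys. It also shows what Enum.uniq_by/2 can and cannot tell you: comparing lengths answers "are there any duplicates?" but not "which ones?".

```elixir
# Made-up, trimmed records for illustration only.
deals = [
  %{"id" => "6", "hash" => "497281f9", "title" => "Bono"},
  %{"id" => "556", "hash" => "8efe4b41", "title" => "Leke"},
  %{"id" => "6", "hash" => "497281f9", "title" => "Bono"},
  %{"id" => "7", "hash" => "0a1b2c3d", "title" => "Bono"},
  %{"id" => "556", "hash" => "8efe4b41", "title" => "Leke (edited)"}
]

# Hypothetical helper: keys seen more than once under a given
# definition of "same record" (expressed as key_fun).
dupes_by = fn records, key_fun ->
  for {key, count} <- Enum.frequencies_by(records, key_fun), count > 1, do: key
end

# 1. The whole record must match exactly.
exact = dupes_by.(deals, fn deal -> deal end)

# 2. Same "id"; the other fields may differ.
by_id = dupes_by.(deals, & &1["id"])

# 3. Same combination of selected fields (here only "title").
by_title = dupes_by.(deals, &Map.take(&1, ["title"]))

# Enum.uniq_by/2 drops repeats, so a shorter list means duplicates exist,
# but it does not say which records were repeated.
has_dupes? = length(Enum.uniq_by(deals, & &1["id"])) != length(deals)

IO.inspect(exact, label: "exact duplicates")
IO.inspect(by_id, label: "duplicate ids")
IO.inspect(by_title, label: "duplicate titles")
IO.inspect(has_dupes?, label: "any duplicate ids?")
```

The three results differ on the same input, which is why a small sample with known duplicates is needed before picking one.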