ETS size remains at 707

Hi,

I created a new table and inserted 50k tuples into it, then created another table and inserted 100k tuples into it, but strangely their sizes are the same. I don't know if ETS has a limit on how many tuples it can store.

cool_table = :ets.new(:cool_table, [])

data =
  "data/insert/50000.csv"
  |> File.read!()
  |> String.split("\n")
  |> Enum.map(fn line -> line |> String.split(",") |> List.to_tuple() end)

:ets.insert(cool_table, data)

The information that I got from these two tables is below.

 :ets.info(cool_table_50000)
[
  id: #Reference<0.192535627.862584833.245646>,
  decentralized_counters: false,
  read_concurrency: false,
  write_concurrency: false,
  compressed: false,
  memory: 22037,
  owner: #PID<0.105.0>,
  heir: :none,
  name: :cool_table_50000,
  size: 707,  # its exactly same for 10000 tuples
  node: :nonode@nohost,
  named_table: false,
  type: :set,
  keypos: 1,
  protection: :protected
]

for 100k tuples
[
  id: #Reference<0.192535627.862584834.97867>,
  decentralized_counters: false,
  read_concurrency: false,
  write_concurrency: false,
  compressed: false,
  memory: 22069,
  owner: #PID<0.105.0>,
  heir: :none,
  name: :cool_table,
  size: 707,
  node: :nonode@nohost,
  named_table: false,
  type: :set,
  keypos: 1,
  protection: :protected
]

Is it because ETS keeps only this much in main memory and retrieves the rest from disk? If that is the case, how can I get accurate results?

Thanks

ETS does not use disk storage (that is DETS), so that is not the reason at least.

Since your table is a :set, could it be that you have many items with the same key (first item in the tuple)? Only one of those would be stored.
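To illustrate the point about duplicate keys, here is a minimal sketch using made-up rows (the table names and data are hypothetical, not taken from the CSV in question). It inserts the same three tuples into a :set table and a :duplicate_bag table and compares the reported sizes.

```elixir
# Hypothetical rows: three tuples, two of which share the key "a".
rows = [{"a", 1}, {"a", 2}, {"b", 3}]

# A :set table keeps at most one object per key (keypos 1 by default).
set = :ets.new(:demo_set, [:set])
:ets.insert(set, rows)
set_size = :ets.info(set, :size)

# A :duplicate_bag table keeps every inserted object.
bag = :ets.new(:demo_bag, [:duplicate_bag])
:ets.insert(bag, rows)
bag_size = :ets.info(bag, :size)

IO.inspect({set_size, bag_size})
# prints {2, 3}
```

If the first column of the CSV is not unique, a :set table will report the number of distinct keys rather than the number of rows.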

It probably has something to do with how memory is handled and the efficiencies/optimizations in ETS:

https://erlang.org/doc/efficiency_guide/advanced.html#memory

A Stack Overflow answer mentions two other Erlang functions you can try: :erlang.memory/1 and :ets.i/0

The Stack Overflow solution was something like:

:ets.info(table, :memory) * :erlang.system_info(:wordsize)
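As a rough sketch of that calculation in Elixir syntax (the table name and the inserted rows are made up for illustration): :ets.info/2 with :memory reports the table's allocation in machine words, and multiplying by the word size gives bytes.

```elixir
# Hypothetical table with 1,000 rows, just to have something to measure.
table = :ets.new(:demo_memory, [:set])
:ets.insert(table, for(i <- 1..1_000, do: {i, "row #{i}"}))

# :memory is reported in words, not bytes.
words = :ets.info(table, :memory)
word_size = :erlang.system_info(:wordsize)
bytes = words * word_size

IO.puts("#{words} words * #{word_size} bytes/word = #{bytes} bytes")
```

The exact numbers depend on the VM and platform, so no output is shown here.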

Seconding Nicd’s comment. It sounds like your CSV only has 707 unique values in the first column.
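One way to check this, sketched below with a small inline CSV standing in for the real file (the sample data is an assumption), is to count the distinct values in the first column and compare that with the total number of rows.

```elixir
# Hypothetical CSV content; replace with File.read!/1 on the real file.
csv = """
a,1
a,2
b,3
"""

# Take the first column of every non-empty line.
keys =
  csv
  |> String.split("\n", trim: true)
  |> Enum.map(fn line -> line |> String.split(",") |> hd() end)

total = length(keys)
unique = keys |> Enum.uniq() |> length()

IO.inspect({total, unique})
# prints {3, 2}
```

If the unique count for the real file comes out as 707, that would explain the table size.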
