Help with phoenix application

lucianoengel · August 1, 2017, 5:32pm

Hello everyone!! This is my first post in this forum, although I’ve been around long enough already to fall in love with this community and elixir/phoenix. I’m quite new to elixir and phoenix and recently I’ve started building a blockchain explorer with it (https://github.com/CityOfZion/neo-scan). It has been a tough path but I’m liking it :). I would now like some feedback and suggestions to improve the application, since I’m no expert. I also have some problems that I don’t fully understand. The project is an umbrella with three applications: the phoenix endpoint, the repo, and a sync application that fetches data from the blockchain and syncs the DB.

Question 1: Do I need to wrap each application under a server/API and communicate between them through messages? Right now I’m just calling functions from each other’s modules…

Question 2: My sync application has two workers: one that makes concurrent http requests to the blockchain node and stores raw data, and another that builds the DB based on the data previously stored. I want them to work in parallel, but so far I can’t make it work, only one at a time. I’ve tried separating them into different applications but had the same result: the other worker only starts when the first one ends. What am I missing?

Question 3: To store a transaction in the DB, the application first needs to fetch data from the DB through queries (to link blocks/transactions). Sometimes, when there’s too much data to be queried, erlang throws a memory error (heap_alloc: Cannot allocate x bytes of memory of type “heap”). How to deal with this?

Question 4: In my home controller there’s a query for the last elements in the “blocks” and “transactions” tables. Since each table has more than a million elements, it is taking forever to do the sorting and retrieval. I’ve tried using indexes in descending order but the result was the same, though postgres write time increased considerably. Am I missing something?

Sorry for the long post, I’ve been trying to sort everything out on my own but now it felt like I could use some help. Any other tips and suggestions are appreciated!

Thanks!
Luciano S. Engel

OvermindDL1 · August 1, 2017, 7:19pm

Oooo, awesome!

A defined ‘function’ API is perfect. Messages should usually only be internal details to an API.
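To make that concrete, here is a minimal sketch of a function API with the messages kept internal. The module name SyncState and its functions are made up for illustration and are not from the neo-scan code:

```elixir
defmodule SyncState do
  # Hypothetical module: remembers the height of the last block synced.
  use GenServer

  ## Public function API: this is all that other applications call.

  def start_link(initial_height \\ 0) do
    GenServer.start_link(__MODULE__, initial_height, name: __MODULE__)
  end

  def last_height, do: GenServer.call(__MODULE__, :last_height)

  def record_block(height) when is_integer(height) do
    GenServer.cast(__MODULE__, {:record_block, height})
  end

  ## Callbacks: the message shapes below stay private to this module.

  @impl true
  def init(initial_height), do: {:ok, initial_height}

  @impl true
  def handle_call(:last_height, _from, height), do: {:reply, height, height}

  @impl true
  def handle_cast({:record_block, new_height}, height) do
    # Keep the highest block height seen so far.
    {:noreply, max(new_height, height)}
  end
end

{:ok, _pid} = SyncState.start_link()
SyncState.record_block(42)
IO.inspect(SyncState.last_height())
```

Callers only ever see start_link/1, last_height/0 and record_block/1, so the message tuples can change later without touching the other umbrella apps.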

You should learn GenStage.

Uhhh, how much data is this?!? o.O?!
What are you calling and how and what is the code around it?

How is the table structure in the DB along with your indexes?

lucianoengel · August 1, 2017, 7:49pm

Hey! Thanks for the reply :).

Each transaction has inputs and outputs, but also each transaction input was an output in a previous transaction. The raw transaction information just passes a reference to this previous transaction in the inputs. When I store a transaction in the DB I already fetch this reference from the DB and save the transaction with the complete data, not just a reference. So I have a table for outputs, which are also used as inputs for following transactions. My tables are as follows:

defmodule Neoscan.Transactions.Transaction do
  use Ecto.Schema
  import Ecto.Changeset


  schema "transactions" do
    field :attributes, {:array, :map}
    field :net_fee, :string
    field :scripts, {:array, :map}
    field :size, :integer
    field :sys_fee, :string
    field :txid, :string
    field :type, :string
    field :version, :integer
    field :vin, {:array, :map}


    field :time, :integer
    field :block_hash, :string
    field :block_height, :integer

    field :nonce, :integer
    field :claims, {:array, :map}
    field :pubkey, :string
    field :asset, :map
    field :description, :string
    field :contract, :map

    has_many :vouts, Neoscan.Transactions.Vout
    belongs_to :block, Neoscan.Blocks.Block

    timestamps()
  end
end

defmodule Neoscan.Transactions.Vout do
  use Ecto.Schema
  import Ecto.Changeset

  schema "vouts" do
    field :asset, :string
    field :address_hash, :string
    field :n, :integer
    field :value, :float
    field :txid


    belongs_to :transaction, Neoscan.Transactions.Transaction
    belongs_to :address, Neoscan.Addresses.Address
    timestamps()
  end
end

Every output to a new (never mentioned) address also triggers the creation of this address in the DB. New assets and claims of assets are also triggered. So everything happens under the create_transaction function below:

  def create_transaction(%{:time => time, :hash => hash, :index => height } = block, %{"vout" => vouts, "vin" => vin} = attrs) do

    #get inputs
    new_attrs = cond do
       Kernel.length(vin) != 0 ->

         new_vin = Enum.map(vin, fn %{"txid" => txid, "vout" => vout_index} ->
           query = from e in Vout,
           where: e.txid == ^txid,
           where: e.n == ^vout_index,
           select: %{:asset => e.asset, :address_hash => e.address_hash, :n => e.n, :value => e.value, :txid => e.txid}

           Repo.one!(query)
         end)

         Enum.map(new_vin, fn vin -> Addresses.insert_vin_in_address(vin) end)
         Map.put(attrs, "vin", new_vin)
       true ->
         attrs
    end

    #get claims
    new_attrs1 = cond do
       attrs["claims"] != nil ->

         new_claim = Enum.map(attrs["claims"], fn %{"txid" => txid, "vout" => vout_index} ->
           query = from e in Vout,
           where: e.txid == ^txid,
           where: e.n == ^vout_index,
           select: %{:asset => e.asset, :address_hash => e.address_hash, :n => e.n, :value => e.value, :txid => e.txid}

           Repo.one!(query)
         end)

         Enum.map(new_claim, fn %{:txid => txid} -> Addresses.insert_claim_in_addresses(vouts, txid) end)

         Map.put(new_attrs, "claims", new_claim)
       true ->
         new_attrs
    end

    #create asset if register Transaction
    cond do
      attrs["asset"] != nil ->
        %{"amount" => amount} = attrs["asset"]
        {float, _} = Float.parse(amount)
        new_asset = Map.put(attrs["asset"], "amount", float)
        create_asset(attrs["txid"], new_asset)
      true ->
        nil
    end

    #create asset if issue Transaction
    cond do
      attrs["type"] == "IssueTransaction" ->
        Enum.map(vouts, fn %{"asset" => asset_hash, "value" => value} ->
          {float, _} = Float.parse(value)
          add_issued_value(asset_hash, float)
        end)
      true ->
        nil
    end

    #prepare and create transaction
    transaction = Map.put(new_attrs1,"time", time)
    |> Map.put("block_hash", hash)
    |> Map.put("block_height", height)
    |> Map.delete("vout")

    Transaction.changeset(block, transaction)
    |> Repo.insert!()
    |> create_vouts(vouts)
  end

The block migration is the following:

defmodule Neoscan.Repo.Migrations.Blocks do
  use Ecto.Migration

  def change do
    create table(:blocks) do
      add :confirmations, :integer
      add :hash, :string
      add :index, :bigint
      add :merkleroot, :string
      add :nextblockhash, :string
      add :nextconsensus, :string
      add :nonce, :string
      add :previousblockhash, :string
      add :script, {:map , :string}
      add :size, :integer
      add :time, :integer
      add :version, :integer
      add :tx_count, :integer

      timestamps()
    end

    create index(:blocks, ["index DESC NULLS LAST", :hash], unique: true)

  end
end

Thanks a lot!! I’ll look into GenStage!

PatNowak · August 1, 2017, 8:15pm

Hi and welcome to this forum! It’s great that you started your journey with Elixir and Phoenix!

Basically in Elixir (and Phoenix), as Chris mentions all the time, the rule is to use “modules and functions with good names”, and most of the time you will call a function from another module. That’s pretty much it. You can call it in another process (when using a GenServer or a Task), but still - you call this function anyway.
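A small sketch of that idea, using only the standard library: the same functions are called once directly and once inside Task processes that run at the same time. Fetcher and Builder are hypothetical stand-ins for the two sync workers, not the real modules:

```elixir
defmodule Fetcher do
  # Hypothetical stand-in for the worker that downloads raw blocks.
  def fetch(heights), do: Enum.map(heights, fn h -> {:raw_block, h} end)
end

defmodule Builder do
  # Hypothetical stand-in for the worker that builds DB rows from raw data.
  def build(raw_blocks) do
    Enum.map(raw_blocks, fn {:raw_block, h} -> %{index: h} end)
  end
end

# Plain function call in the current process.
raw = Fetcher.fetch([1, 2, 3])

# The same functions, each called inside its own process.
# Both tasks are started before either result is awaited,
# so the two pieces of work run concurrently.
fetch_task = Task.async(fn -> Fetcher.fetch([4, 5, 6]) end)
build_task = Task.async(fn -> Builder.build(raw) end)

more_raw = Task.await(fetch_task)
rows = Task.await(build_task)
IO.inspect({more_raw, rows})
```

If one worker only starts after the other finishes, it is worth checking whether the first one does all of its work in a blocking call (for example inside init/1 or directly in the caller) before the second one is even started.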

When you have performance issues while dealing with a huge amount of data in the DB, consider adding more indexes and try to optimize your queries so they do not search through the whole table. Also consider using Flow - it might help. IMO using Flow is a bit easier than using GenStage directly, because Flow handles the concurrency for you.

It would be very beneficial for you to split the huge function into a bunch of smaller ones. It will help you organize, test and clean up your code. You should also take advantage of pattern matching, and the best way to do so is the one I already mentioned - split the code into smaller functions and use pattern matching in their function definitions. That would allow you to get rid of cond everywhere and make your code much more readable.
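For example, the two cond blocks around “asset” and “IssueTransaction” could become small multi-clause functions. This is only a sketch: TxSteps and its function names are hypothetical, and the functions return data instead of calling create_asset or add_issued_value so that they can be tested without a database:

```elixir
defmodule TxSteps do
  # Hypothetical helpers; the attrs shapes mirror create_transaction above.

  # Matches only when the transaction carries an asset with a string amount,
  # and returns the asset with the amount parsed to a float.
  def asset_params(%{"asset" => %{"amount" => amount} = asset}) when is_binary(amount) do
    {float, _rest} = Float.parse(amount)
    {:ok, Map.put(asset, "amount", float)}
  end

  # Any other transaction has no asset to register.
  def asset_params(_attrs), do: :none

  # Matches only issue transactions and lists {asset_hash, value} pairs.
  def issued_values(%{"type" => "IssueTransaction"}, vouts) do
    Enum.map(vouts, fn %{"asset" => asset_hash, "value" => value} ->
      {float, _rest} = Float.parse(value)
      {asset_hash, float}
    end)
  end

  # Every other transaction type issues nothing.
  def issued_values(_attrs, _vouts), do: []
end

IO.inspect(TxSteps.asset_params(%{"asset" => %{"amount" => "100.5"}}))
IO.inspect(TxSteps.asset_params(%{"type" => "MinerTransaction"}))
```

The function heads now say which case each clause handles, and the catch-all clauses replace the `true -> nil` branches.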

Another important thing is to avoid having business logic (fetching from DB etc.) in the controller. Use the context for that (if you are using Phoenix 1.3) or helper (if not). You can also even have different modules for queries and different for communication with DB, but it’s up to you.

BTW: Transaction is a term that’s strictly related to the database. I guess that in your example it is something different - not a real database transaction, but some kind of schema (model). Please consider using a different name.

I hope you will not be upset by my comments. Glad that you are here. Enjoy Elixir!

OvermindDL1 · August 1, 2017, 8:28pm

Whoa you are doing a lot of queries I’d imagine then, inside the Enum.map. Might be better to craft a query to get a lot of things at once, this is probably the slow bit just by looking at it. ^.^;

As with any database using any language, you want to minimize your number of database calls, such as by using a more complex query to get out more data at once instead of simpler queries calling in to it a lot.

lucianoengel · August 1, 2017, 8:35pm

Hey Thanks for the reply! I’m not upset haha, we’re here to learn !

Sorry, I know I should’ve split everything. I didn’t know I would need those “appends” in the code.
The documentation on the blockchain isn’t complete, so I’m kind of understanding how it works as I code (still don’t understand it completely). Once I get the full picture I’ll refactor this whole function.

Transactions is a table for information about real transactions of assets (cryptocurrencies)… Changing this name might clear up that confusion in the code, but it could also confuse people looking for these transactions in the code… I dunno.

I’ll follow your advice and take away DB requests from the controllers, and look into Flow!

Thanks!!

lucianoengel · August 1, 2017, 8:43pm

Humm that’s a good point!! I dunno how to make this query though… is it possible to define an array of queries and fetch them all at the same time through Repo.all()?

OvermindDL1 · August 1, 2017, 9:09pm

Not really (kind of, but don’t), the better way would be to use normal SQL to grab what you need, like for your first Enum.map body the query could probably be closer to just:

lookups = Enum.map(vin, &"#{&1["vout"]}#{&1["txid"]}")

query =
  from e in Vout,
  where: fragment("CAST(? AS text) || ?", e.n, e.txid) in ^lookups,
  select: %{:asset => e.asset, :address_hash => e.address_hash, :n => e.n, :value => e.value, :txid => e.txid}

new_vin = Repo.all(query)

Or something like that. Just need to grab them all in one big go, you’d do it the same way as you’d do SQL in any other language.
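One thing to keep in mind with the batched version: without an order_by, the rows are not guaranteed to come back in the same order as vin, which the one-query-per-input loop gave for free. If the order matters, a sketch like this puts them back in input order (VinLookup is a made-up helper, and the rows list stands in for the result of Repo.all(query)):

```elixir
defmodule VinLookup do
  # Hypothetical helper: puts rows fetched in one batch back into `vin` order.
  def in_input_order(vin, rows) do
    # Index the fetched rows by {txid, n} for constant-time lookup.
    by_key = Map.new(rows, fn row -> {{row.txid, row.n}, row} end)

    Enum.map(vin, fn %{"txid" => txid, "vout" => n} ->
      # Map.fetch!/2 raises if a referenced output is missing,
      # similar in spirit to Repo.one!/1 in the original loop.
      Map.fetch!(by_key, {txid, n})
    end)
  end
end

vin = [%{"txid" => "aa", "vout" => 1}, %{"txid" => "bb", "vout" => 0}]

# Stand-in for Repo.all(query); the rows arrive in a different order here.
rows = [
  %{txid: "bb", n: 0, value: 5.0},
  %{txid: "aa", n: 1, value: 2.5}
]

IO.inspect(VinLookup.in_input_order(vin, rows))
```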

lucianoengel · August 1, 2017, 11:29pm

That seems to work!! Thanks!!

OvermindDL1 · August 1, 2017, 11:32pm

Hah really? On my first try? Nice. I just typed that straight into the post area and hoped. ^.^

Is it faster once you change both of the maps into single calls in that format? I’d think it should be substantially faster.

lucianoengel · August 1, 2017, 11:36pm

I’m trying that right now! It will take some time though… It needs to fetch the chain first, since I have yet to solve the GenStage issue haha.

lucianoengel · August 3, 2017, 7:06pm

It is working! I had to rethink a lot of my queries, but now everything seems to be a lot faster and there are no more memory errors.

Thank you everyone! The app is now live at https://www.neoscan.io/

OvermindDL1 · August 3, 2017, 7:36pm

Hah, built-in API docs and all, that is nice. ^.^

lucianoengel · August 3, 2017, 7:56pm

It wasn’t easy, but elixir does pay off the hard work haha