# Please help me improve my solution to the "Computing GC Content" challenge on the Rosalind site (bioinformatics topic)

I would like your help improving my solution.
I solved the challenge, but the code is not clear.

I think it could be much better with input from other people.

Eh, it could be shortened a bit and made a bit more efficient, but that would not really make it much more readable. I'd probably write the actual calculation something like this, though (it is significantly faster):

```elixir
iex(1)> dna = "AGCTATAG"
"AGCTATAG"
iex(2)> Enum.reduce(to_charlist(dna), 0, &if(&1==?C or &1==?G, do: &2+1, else: &2))/byte_size(dna)
0.375
```

If that were wrapped in a `case do ... end`, the whole thing could be pipelined into just a dozen lines or so.
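For what it's worth, the one-liner above could also live in a small named function. This is only a sketch: the module name `GcSketch` and the function `gc_fraction/1` are made up for illustration, it assumes the input is plain ASCII DNA, and it uses a binary comprehension with `reduce:` (Elixir 1.8 or later) instead of `to_charlist/1`.

```elixir
defmodule GcSketch do
  # Hypothetical module, used only for illustration.
  # Returns the fraction (0.0 to 1.0) of G and C bytes in an ASCII DNA string.
  @spec gc_fraction(String.t()) :: float
  def gc_fraction(""), do: 0.0

  def gc_fraction(dna) when is_binary(dna) do
    # Walk the binary byte by byte and count only G and C,
    # without building an intermediate charlist.
    count =
      for <<base <- dna>>, base in [?G, ?C], reduce: 0 do
        acc -> acc + 1
      end

    count / byte_size(dna)
  end
end
```

The extra clause for the empty string is there only to avoid a division by zero; whether that case can occur depends on the input.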

Here's a version with some helper functions split out, using `Stream` to eliminate intermediate lists, and binary matching to count the G and C characters:

```elixir
defmodule Gc do
  # Returns the ID with the highest GC content, followed by its percentage.
  def gc_content(dataset) do
    {key, gc_percent} =
      dataset
      |> parse_lines()
      |> Stream.map(fn {k, v} -> {k, gc_percent(v)} end)
      |> Enum.max_by(&elem(&1, 1))

    "#{key}\n#{gc_percent}"
  end

  # Splits the FASTA input into {id, sequence} tuples.
  # Assumes every ID is exactly 13 characters long
  # (for example, "Rosalind_" followed by four digits).
  @spec parse_lines(String.t()) :: Enumerable.t()
  def parse_lines(dataset) do
    dataset
    |> String.replace("\n", "")
    |> String.split(">", trim: true)
    |> Stream.map(&String.split_at(&1, 13))
  end

  @spec gc_percent(String.t()) :: float
  def gc_percent(val), do: Float.round(100 * gc_count(val) / String.length(val), 7)

  # Counts G and C characters by matching on the head of the binary.
  @spec gc_count(String.t(), integer) :: integer
  def gc_count(val, n \\ 0)
  def gc_count("", n), do: n
  def gc_count("G" <> rest, n), do: gc_count(rest, n + 1)
  def gc_count("C" <> rest, n), do: gc_count(rest, n + 1)
  def gc_count(<<_::utf8>> <> rest, n), do: gc_count(rest, n)
end
```
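One thing to keep in mind is that `String.split_at(&1, 13)` only works while every ID is exactly 13 characters long. If that assumption ever stops holding, a possible alternative is to cut each record at its first newline instead. The following is a rough sketch, not a drop-in replacement: `FastaSketch` and `parse/1` are hypothetical names, it returns a plain list rather than a stream, and it assumes `\n` line endings.

```elixir
defmodule FastaSketch do
  # Hypothetical helper, not part of the solution above.
  # Splits a FASTA string into {id, sequence} tuples by treating the first
  # line of each record as the ID, so the ID may be any length.
  @spec parse(String.t()) :: [{String.t(), String.t()}]
  def parse(dataset) do
    dataset
    |> String.split(">", trim: true)
    |> Enum.map(fn record ->
      # First line is the header; the remaining lines are the sequence.
      [id | seq_lines] = String.split(record, "\n", trim: true)
      {String.trim(id), Enum.join(seq_lines)}
    end)
  end
end
```

With this, `FastaSketch.parse(">id_1\nACGT\nGG\n>longer_id\nTT\n")` should give `[{"id_1", "ACGTGG"}, {"longer_id", "TT"}]`, and the resulting tuples can be fed into the same `gc_percent/1` and `Enum.max_by/2` steps as before.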
