As you can see, it has a lot of newline ‘\n’ characters. These are fine for the most part, except where they fall inside the binary. So the question is: how do I remove them from the parts of the string that represent the binary, but not from the rest of the string?
I’d thought of using Code.eval_string and the IO libraries to convert it back, but all that does is remove every newline. Plus, the string normally contains pids in the form “#PID<0.1412.0>”, so that doesn’t help.
I did cook up this lambda to do it, but it seems like a simpler way must exist:
my_trim = fn string ->
  {parsed, _} =
    string
    |> String.codepoints()
    # Split runs of angle brackets apart from everything else
    |> Enum.chunk_by(fn e -> e == "<" or e == ">" end)
    # The accumulator tracks whether we are inside a <<...>> binary
    |> Enum.map_reduce(:no, fn
      ["<", "<"] = block, _acc -> {List.to_string(block), :yes}
      [">", ">"] = block, _acc -> {List.to_string(block), :no}
      block, :yes = acc -> {block |> List.to_string() |> String.replace(["\n", " "], ""), acc}
      block, acc -> {List.to_string(block), acc}
    end)

  List.to_string(parsed)
end
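For comparison, the same idea can be sketched with a regular expression: match each `<<...>>` span lazily and strip newlines and spaces only inside the match. This is a rough sketch, not a tested replacement. The input string is made up for illustration, and it assumes a binary never contains a literal `>>` inside a quoted segment.

```elixir
# Hypothetical inspected output, for illustration only
string = "%{id: #PID<0.1412.0>,\n data: <<1, 2,\n 3>>}"

# The `s` modifier lets `.` match newlines; `.*?` stops at the first `>>`
cleaned =
  Regex.replace(~r/<<.*?>>/s, string, fn match ->
    String.replace(match, ["\n", " "], "")
  end)

IO.puts(cleaned)
```

Single angle brackets, such as those in the pid, are left alone because the pattern only starts at `<<`.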
This is an interesting question, but before we try to answer it specifically, can you elaborate on your use case? Trying to parse a string out of arbitrary Elixir code is going to be a lot harder than just having the string in a more accessible format. Is there no way around this problem?
EDIT: As a first step in solving the problem as stated, I would check out Code.string_to_quoted, which parses Elixir code and turns it into an AST. String literals are then pretty easy to pull out from there.
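As a rough illustration of that suggestion, the sketch below parses a small fragment and collects the string literals from the resulting AST. The `source` value is a made-up example that happens to be valid Elixir syntax. Real inspected output containing `#PID<...>` would not parse as intended, since `#` starts a comment.

```elixir
# Hypothetical input: a fragment that is valid Elixir syntax
source = ~s({"name", <<1, 2,\n  3>>, "other"})

{:ok, ast} = Code.string_to_quoted(source)

# String literals appear in the AST as plain binaries,
# so walk every node and collect the ones that are binaries
{_ast, found} =
  Macro.prewalk(ast, [], fn
    node, acc when is_binary(node) -> {node, [node | acc]}
    node, acc -> {node, acc}
  end)

strings = Enum.reverse(found)
IO.inspect(strings)
```

The `<<1, 2, 3>>` literal shows up as a `{:<<>>, meta, [1, 2, 3]}` node rather than a binary, so it is not collected here.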
Basically, we convert the structure to a string that gets dispatched via a REST call to a Python application for processing and display. I’d rather not change the Python app, since the original code owner has left the company. Plus, my Python skills are rusty at best.
The Inspect protocol is not an interchange format. What is complex_structure? You should use regular Elixir functions to turn that into a value that you send; you should not inspect it and then parse the inspect result.
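A minimal sketch of what that could look like, assuming the structure is made of maps, structs, lists, tuples, binaries, and pids. The module name `PlainData` and the choice to Base64-encode non-printable binaries are assumptions for illustration, not something taken from the original code.

```elixir
defmodule PlainData do
  # Hypothetical helper: recursively turn a term into plain data
  # (maps, lists, strings, numbers, atoms) that is easy to serialize.

  # Pids have no portable representation, so keep their printed form
  def to_plain(pid) when is_pid(pid), do: inspect(pid)

  # Printable binaries pass through; raw bytes are Base64-encoded
  def to_plain(bin) when is_binary(bin) do
    if String.printable?(bin), do: bin, else: Base.encode64(bin)
  end

  def to_plain(list) when is_list(list), do: Enum.map(list, &to_plain/1)
  def to_plain(tuple) when is_tuple(tuple), do: tuple |> Tuple.to_list() |> to_plain()

  # Structs are not enumerable, so convert them to bare maps first
  def to_plain(%_{} = struct), do: struct |> Map.from_struct() |> to_plain()
  def to_plain(%{} = map), do: Map.new(map, fn {k, v} -> {to_plain(k), to_plain(v)} end)

  # Atoms, numbers, and anything else are returned unchanged
  def to_plain(other), do: other
end

result = PlainData.to_plain(%{data: <<0, 255>>, tags: ["a", {:b, 1}]})
IO.inspect(result)
```

The resulting value contains no multi-line binaries, so it can be handed to whatever encoder the receiving side expects without post-processing the text.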