Postgrex + Redshift. Anyone?

ecto
databases

#1

Hey folks,
Does anyone have experience using Postgrex with AWS Redshift, preferably recently and/or in production?
My team is looking to use Elixir to be a broker for a lot of our ETL data engineering work, and we make heavy use of Redshift, among other tools that traditionally have very good support for JVM languages.

I know there has been work done on this front before, but I’m unclear as to where it sits now. For example:

https://groups.google.com/forum/#!msg/elixir-lang-talk/QGB5Tcr18ms/-56rADBMEQAJ points to this fork that is at this point a little old and unloved: https://github.com/colinbankier/postgrex

as well as some stuff on the issue tracker for Postgrex: https://github.com/elixir-ecto/postgrex/issues?utf8=✓&q=redshift

Greatly appreciate any reports/help!


#2

i use postgrex with redshift internally. you need version 0.13.1 or later (not 1.0.0-rc.0 or 1.0.0-rc.1 which predate the 0.13 series) but it should work fine


#3

@talentdeficit how is this working out for you? Are you using this along with Ecto?


#4

i don’t use ecto with that particular system. it may generate sql that redshift can’t handle (as it supports only a subset of what postgres supports) but i think if you stick to queries that redshift can handle it would be fine


#5

HI,

Just wondering what you ended up using?

Thanks! =)


#6

Hi @demem123,

The system is now in production and works well.

We’ve gone with using Ecto to run stringified/templated SQL against our Redshift cluster. The system is purely internal and only runs trusted queries, so we elected to go this route despite string interpolation in SQL being wildly insecure otherwise.

One thing working in our favor is that the queries we’re running against Redshift using this method are 1. largely administrative, 2. simple, 3. rarely change. For those reasons, we haven’t felt a great need to use Ecto’s query builder or anything on that level of sophistication.

Overall this works well enough for our particular workflow, but I would definitely not use this method if any piece of the system was public facing.


#7

thanks! I started trying out postgrex as suggested by the other user. Just wondering do you do upserts on your redshift data? It sucks that in order to update records, i have to delete and re-insert as per redshift document.
Thanks! =)


#8

redshift doesn’t support upsert and the redshift client library is actually just postgrex so if you try to use ecto upserts with redshift you’ll get sql errors in your queries


#9

Thanks! I’m sticking with postgrex then. Do you have to use a pool to manage db connections. poolboy? I think Ecto repo has a pool of connections to work off of.


#10

postgrex has pooling built in. see the options for start_link


#11

thanks! will check it out. I think the last time I tried, i used a child_spec and it didnt create a pool. :stuck_out_tongue:

Thanks so much! =)


#13

Since you are also using Redshift from Elixir you might be interested that we released RedshiftEcto (Redshift adapter for Ecto) yesterday. It aims to provide a good experience working with a Redshift database from Ecto. Features unsupported by Redshift will raise an error and the adapter supports table and column options specific to Redshift in migrations.

There’s a separate topic for the announcement:


#14

I’ve had very little luck getting RedshiftEcto to connect to my redshift instance. Has anyone else gotten this working?


#15

We’re using it in production without any issues. If you tell me a little more about your setup I might be able to help.


#16

We’re in a VPC and I was trying to tunnel through a bastion box to connect from my machine. Otherwise nothing is really special, it’s phoenix 1.4, ecto 2.2, and that’s about it. We we just getting timeouts on any attempt to connect.

We may try to run this on regular Postgres, and see if that meets out latency goals.


#17

We have a similar setup, but using a VPN instead of ssh tunnelling to connect to a private Redshift instance inside the VPC’s private network.

I just tested it with an ssh tunnel instead and that works for me. Can you otherwise connect to Redshift through the same tunnel with a regular client (e.g. psql or a desktop app)?


#18

Yes, I could connect with Datagrip.