Finger, a verification library to prove hommaannnnnness of a user

I am only joking, I appreciate the help!

with 0.1.4

  • the 0 edge case is gone
  • added actual integration tests to solidify the stitching part

Now it is time to update the elxsy and clarify the instructions with help

If it is culturally too complex to make one solution that suits everyone, I wonder whether you could use pictures of random objects, like coins, pencils, forks, spoons, bottles, etc., for the same purpose. It may even be harder for AI engines to break.

Those are interesting and good reads, thank you! I will definitely read more of the first link :slight_smile:

I basically did what you are describing a decade ago, before captcha2 ripped it off so badly that people make memes about it. I posted a short history in this blog post but didn’t link to the library.

That picture comes from the Web Archive, 2009-06.

The idea was to click on distinctively different objects, requiring user input that bots don’t have and allowing people to validate the result locally with a 3-4 line method that uses the same signature as the API call result.

Sorry, I meant: show many random objects of each type and make people count them instead of fingers, e.g. 5 coins, 3 forks, 1 clothespin, 7 marbles.
People don’t even need to know what the objects are, so long as the objects are distinguishable.

I see, we are going down the Google route there.

Our vision speed: clustering >> detection >> analysis (counting).

Hands and fingers have been native to us since childhood, so I went down the analysis route. Identifying and counting several different objects would take longer and be a chore for many.

I will sleep on it.

I might resurrect IamHuman with my own image set as an open-source service again. Then I won’t have any trouble with copyright or royalties on the pictures used.

Some of those are quite interesting. ^.^

I have had so many typos in my post, lack of sleep showed there! I can’t edit it now but thank you for understanding.

Two decades ago I had to take instruction and teaching courses as well as computer science and engineering courses.

I studied both of these subjects and found them fascinating. They greatly helped me build visual games and simulations for challenging physics subjects.

Their basis is very similar to CS: how humans operate (Behaviour - Kernel) and how humans learn (Cognitive - How to shape your programme).

If it’s of any interest, have a read.

So far
It has been out in the wild for a couple of days with 0 spam incoming, and now it is time to reap the benefits.

Setup and context

  • The field is named “fingers”, which throws bots off: they don’t know how to treat the field name.
  • It is marked as required on the HTML5 form as well as in the server validation, which currently happens in the controller or Ecto.
  • It is a number field in HTML5, so only integer numbers can be entered, provided the browser supports it.
  • A human is expected to enter numbers into it.

I have been logging the “failed cases”.

Don’t worry, I am masking the password and email fields before logging.

Failed Registration from 5.166.201.102
%{"email" => "***@gmail.com", "fingers" => "", "name" => "Susanteete", "password" => "***"}

Failed Contact from 170.254.230.186
%{"email" => "***@edlen.com", "fingers" => "rdHoGzYqkDZLgVc", "message" => "mWAhJiaHvrMLSlR", "subject" => "YQJxLAFTXWkVl"}

Failed Registration from 170.254.230.186
%{"email" => "***@edlen.com", "fingers" => "NMFbSguTeYn", "name" => "ocJLmzwV", "password" => "***"}
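The masking seen in these log lines could look roughly like the following. This is a minimal sketch in Python rather than the Elixir the site actually runs, and `redact` is a hypothetical helper name, not part of Finger.

```python
def redact(params: dict) -> dict:
    """Return a copy of the submitted form params that is safe to log."""
    safe = dict(params)  # never mutate the original params
    if "password" in safe:
        safe["password"] = "***"
    email = safe.get("email")
    if isinstance(email, str) and "@" in email:
        # Keep only the domain part, which is still useful for spotting patterns.
        safe["email"] = "***@" + email.rsplit("@", 1)[1]
    return safe
```

The other fields, including the “fingers” value the bot sent, are left untouched because they are exactly what we want to analyse later.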

You can check the reputation of IPs with this open community awesomeness (which I use to mass ban 20k bots via my firewall; god knows how many logs I would get if I hadn’t already)

https://www.abuseipdb.com/check/170.254.230.186

It is definitely a harmful spam bot.
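The same lookup can be done programmatically. The sketch below only builds the request and reads the score, so it runs without network access. It assumes AbuseIPDB’s v2 `check` endpoint with the `ipAddress` and `maxAgeInDays` parameters, a `Key` header, and an `abuseConfidenceScore` field in the response; please verify these against their current API docs. The function names are hypothetical.

```python
import urllib.parse
import urllib.request

# Assumed endpoint of the AbuseIPDB v2 API; confirm against their documentation.
ABUSEIPDB_CHECK_URL = "https://api.abuseipdb.com/api/v2/check"

def build_check_request(ip: str, api_key: str, max_age_days: int = 90) -> urllib.request.Request:
    """Build (but do not send) a reputation lookup for one IP address."""
    query = urllib.parse.urlencode({"ipAddress": ip, "maxAgeInDays": max_age_days})
    return urllib.request.Request(
        f"{ABUSEIPDB_CHECK_URL}?{query}",
        headers={"Key": api_key, "Accept": "application/json"},
    )

def confidence_from_response(body: dict) -> int:
    """Extract the 0-100 abuse confidence score from a decoded JSON response."""
    return int(body["data"]["abuseConfidenceScore"])
```

Sending it would be a matter of passing the request to `urllib.request.urlopen` and decoding the JSON body.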

Armed with these bits of information we could do the following.

Taking it to the next level

When the verification fails, don’t just disregard it, but pass the expected answer and the given answer into an analysis section of the library. It could be a different supervision tree or even an offline cron process.

Hoomannn vs Makina

  1. Is it empty? It can’t be empty for an actual user with a “text/html” browser.
  2. Is it the required string length? We asked for 2 images; how many digits did we receive?
  3. Does it contain non-numeric characters? It can’t, or at least shouldn’t, for an actual user on an HTML browser.

OK, the total weight of the matches is higher than the threshold, so it is very likely a bot but could be a human dicking around:

  • Verify the reputation and confidence with the abuseipdb API
  • Send the report to abuseipdb, because even dicking around is abuse and should be reported. If it’s a genuine user, the report will expire after a while as long as no other reports come in.

Not enough confidence or evidence:

  • log and move on

Enough confidence and evidence:

  • Ban the IP via OS, network firewalls (ufw, iptables, ipset, WAF etc)
  • Preferably via a small driver package for each flavour
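To make the idea concrete, here is a rough sketch of the three checks and the decision that follows. It is written in Python for brevity and is not part of Finger; the weights, the threshold, the `ban_at` cut-off and all function names are assumptions that would need tuning against real logs.

```python
# Hypothetical weights and threshold; tune them against real failed cases.
WEIGHTS = {"empty": 3, "wrong_length": 2, "non_numeric": 3}
THRESHOLD = 3

def suspicion_score(given: str, expected_length: int) -> int:
    """Sum the weights of every heuristic the submitted answer matches."""
    score = 0
    if given == "":
        score += WEIGHTS["empty"]
    if len(given) != expected_length:
        score += WEIGHTS["wrong_length"]
    if any(ch not in "0123456789" for ch in given):
        score += WEIGHTS["non_numeric"]
    return score

def decide(given: str, expected_length: int, abuse_confidence: int, ban_at: int = 90) -> list:
    """Map a failed answer plus an IP reputation score (0-100) to actions."""
    if suspicion_score(given, expected_length) < THRESHOLD:
        return ["log"]            # not enough evidence: log and move on
    actions = ["report"]          # over the threshold: always report
    if abuse_confidence >= ban_at:
        actions.append("ban")     # enough confidence: hand over to the firewall driver
    return actions
```

With these numbers an empty field or a random string such as “NMFbSguTeYn” crosses the threshold on its own, while a human who typed a single digit for two images only gets logged.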

what do you people think?

When your library gets traction it will end up in the human-powered captcha solving APIs:

We assign a worker for your captcha

100% of captchas are solved by human workers from around the world. This is why by using our service you help thousands of people to feed themselves and their families.
An average worker makes about $100 per month which is a very good salary in such countries like India, Pakistan, Vietnam and others. With your help they now have a choice between working in polluted industries and working in front of a computer.

So, you may want to give a random name to each image you send for a challenge, like a timestamp-based hash. That will make it hard or impossible for this type of service to cache the result of each of your challenges.

ahaha loved that website, the animations especially the quality control side :smiley: thank you for that!

Currently, all images are stitched together as a final image with the name of your choosing. The software won’t be able to identify individual pictures.

From their instructions, it is apparently solved by actual humans so it will solve any captcha on the planet since it is designed to prove humanness :smiley: we need an anti-anti-captcha design that is only solvable by machines for those guys :smiley:! That will show them!

Bear in mind that these guys are just one of many services of this type. Many more exist, and some may only exist on the dark web.

The point is that it’s solved by humans only on the first API call made to their service. Afterwards they return the cached result, otherwise it would not scale.

In my opinion this name should be unique for each time a challenge is made.
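One way to get a name that is unique per challenge is a random URL-safe token instead of a timestamp-based hash, since a token from a cryptographic random source cannot be predicted from the request time. A small sketch using Python’s standard `secrets` module; the function name is hypothetical.

```python
import secrets

def random_image_name(nbytes: int = 24) -> str:
    """Return an unpredictable URL-safe name; 24 random bytes encode to 32 characters."""
    return secrets.token_urlsafe(nbytes)
```

The name carries no information about the answer, so it only defeats caching by URL, not caching by image content.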

Yeah definitely, this is what I said on the blog post.

It is bespoke and good enough to deter the 99% and slow down the 1% - if they ever decide to bother with an already free and non-membership required services platform.

Well then, in Finger’s case it would work the first time and then not work again. You can’t cache the result, it changes on every page load. (Unless you did something funky with the library.)

Workflow:

GET /finger.jpg

finger_controller

> {answer, image} = Finger.generate(2)
> {"35", <<1, 3, 4, ... >>}
> session_put(finger, answer)
> respond(image)

GET /important

important stuff template
<form action=important method=post>
<img src=finger.jpg>
<input type=number name=finger>
</form>

POST /important

important stuff controller
> if form.finger == session_get(finger) then do stuff

They can cache that jpg all they want; the answer will be valid only the first time, and only if it is solved by a human.
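The “valid only the first time” part depends on the stored answer being thrown away as soon as it is checked. A minimal sketch of that step, in Python with a plain dict standing in for the session; `verify_once` is a hypothetical name, not a Finger function.

```python
import hmac

def verify_once(session: dict, submitted: str) -> bool:
    """Check the submitted answer against the session and consume it."""
    # pop() removes the answer whether or not the guess is right,
    # so a replayed or second attempt always fails.
    expected = session.pop("finger", None)
    if expected is None:
        return False
    # Constant-time comparison; encode so non-ASCII input cannot raise.
    return hmac.compare_digest(expected.encode(), submitted.encode())
```

A failed guess also burns the challenge here, which forces a fresh image for every attempt.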

You could also define a random route for the finger controller if you never wanted them to cache that statically named dynamic picture.

/finger/:name 

<img src=finger/random_url_safe_string 32 >

If I were to actually deal with them, I would send a zip bomb or a corrupt chunked 100MB file directly from nginx to their IP ranges every time they wanted to cache something.

After I had my fun, I would drop the IP ranges at the FW perimeter.

Google’s reCAPTCHA v3 (or was it v4… whichever one uses an excessive amount of JS to try to scan everything about you) is trying for that. I may not like all the tracking and JS stuff it does, but it works quite well at analysing browser ‘usage’ to determine who’s a bot, all without showing a captcha to the actual person (unless certain criteria are met).

Yes, it does work quite well, but unfortunately it is in the hands of an evil corp that snoops on everything you do.

You have others doing the same as Google reCAPTCHA V3, like:

Also a good alternative or addition is to use:

$100 a very good salary in Vietnam. :rofl:

unfortunately the reality of our world though

They are not caching the image URL or anything; they are probably just caching a simple Fourier transform of it, which will technically be a cache of that unique combination.

Those transforms are immune to resizing etc, and the only way you can break similarity scores is by warping the image, hence why a lot of the old captchas had words that looked like they were viewed through a lens.

You need to randomly (even inside what is technically the same permutation like 1-3-2-1) place the images and skew them in ratio and placement for such a caching to not work.
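To illustrate why resizing does not help but moving things around does, here is a toy fingerprint. Note that it is an average hash, a much simpler technique than the Fourier-based one described above, swapped in because it fits in a few lines of pure Python; what solving services actually use is not known to me. Images are plain lists of grayscale rows and all names are hypothetical.

```python
def average_hash(pixels, size=4):
    """One bit per block: 1 where the block is brighter than the overall mean.
    Assumes the image width and height are multiples of `size`."""
    bh, bw = len(pixels) // size, len(pixels[0]) // size
    blocks = []
    for by in range(size):
        for bx in range(size):
            cells = [pixels[y][x]
                     for y in range(by * bh, (by + 1) * bh)
                     for x in range(bx * bw, (bx + 1) * bw)]
            blocks.append(sum(cells) / len(cells))
    mean = sum(blocks) / len(blocks)
    return tuple(int(b > mean) for b in blocks)

def upscale(pixels, factor):
    """Enlarge the image by repeating every pixel `factor` times in both directions."""
    return [[v for v in row for _ in range(factor)]
            for row in pixels for _ in range(factor)]

def shift_right(pixels, n, fill=0):
    """Move the content n pixels to the right, padding the left edge with `fill`."""
    return [[fill] * n + row[:-n] for row in pixels]

# An 8x8 test image: bright left half, dark right half.
image = [[255] * 4 + [0] * 4 for _ in range(8)]
same_after_resize = average_hash(image) == average_hash(upscale(image, 2))
same_after_shift = average_hash(image) == average_hash(shift_right(image, 2))
```

Here `same_after_resize` comes out true and `same_after_shift` false, which is the property that random placement and skewing try to exploit.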

the famous FT hits us again :slight_smile:

Luckily that is very achievable with the way we stitch with ImageMagick. I am the only user so far, so I don’t think I need to go to those levels just yet :slight_smile:

I am finding out that people just enter the sum on the first try rather than reading the instructions.

Changing to sums would drastically increase the probability of a right guess, from a minimum of 1/44 to 1/8, but that seems to be human nature or conditioned internet-based behavior so far.

PS: the given stats are for a set of n=2 images, as I use on my website. The number gets smaller as you increase n to the max of 9.