Miscellaneous – 2. Anticaptcha
Author: valrkey
Worth: $250
Description: Wow, this is a big captcha. Who has enough time to solve this? Seems like a lot of effort to me!
As you can tell by the tiny scroll bar, there were a large number of questions (609) to be answered. To make things more difficult, each time the page was visited, the question order and numeric values were randomized. The questions generally followed one of three formats:
- What is the # word in the following line: ...?
- Is # a prime number?
- What is the greatest common divisor of # and #?
For each of these question formats, I wrote a PowerShell function to determine the answer.
Word in Line
This function takes in the INDEX of the word requested and the LINE to take the word from. I added a word count check just in case the IceCTF staff are jerks and give a too-large index. Everything should be accounted for after mapping the "1st word" to the 0th array index and getting rid of any trailing '.' characters.
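The original function is not reproduced here. A minimal sketch of the same idea, assuming a 1-based INDEX and whitespace-separated words (the name Get-WordInLine and its parameters are hypothetical), might look like this:

```powershell
# Hypothetical helper: return the Nth (1-based) word of a line.
function Get-WordInLine {
    param(
        [int]$Index,     # 1-based position of the requested word
        [string]$Line    # sentence to take the word from
    )
    # Split on runs of whitespace and drop any empty entries
    $words = @($Line.Trim() -split '\s+' | Where-Object { $_ })
    # Guard against an index that falls outside the line
    if ($Index -lt 1 -or $Index -gt $words.Count) { return $null }
    # Map the "1st word" to array index 0 and strip trailing periods
    return $words[$Index - 1].TrimEnd('.')
}
```

Returning $null for an out-of-range index is a design choice of this sketch; it lets the caller decide how to handle a bad question.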
Prime Number
If a number is prime, it must not be evenly divisible by anything except itself and 1. I created a loop that runs from 2 to sqrt(#), checking whether each integer evenly divides our input VALUE. The odd case with this logic is the VALUE 1: the loop finds no divisor for it, yet 1 is not prime by definition, so it needs its own check.
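As a rough sketch of that trial-division loop (the function name Test-Prime is hypothetical, and this is not the original code), assuming non-negative integer input:

```powershell
# Hypothetical helper: trial division up to the square root of the input.
function Test-Prime {
    param([long]$Value)
    # 0 and 1 (and anything negative) are not prime, so handle them up front
    if ($Value -lt 2) { return $false }
    $limit = [long][math]::Floor([math]::Sqrt($Value))
    for ([long]$i = 2; $i -le $limit; $i++) {
        # Any divisor in 2..sqrt(Value) means the number is composite
        if ($Value % $i -eq 0) { return $false }
    }
    return $true
}
```

Stopping at the square root is enough because any factor larger than it must pair with one smaller than it.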
Greatest Common Divisor
This calculation can be accomplished a couple of different ways; I chose Euclid's algorithm since it lent itself to easy recursion and seemed to be more efficient than looping through all numbers below the lesser input VALUE and checking divisibility on both numbers.
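A recursive form of Euclid's algorithm can be sketched as follows. The name Get-Gcd is hypothetical and the original function may have differed in detail:

```powershell
# Hypothetical helper: Euclid's algorithm, gcd(a, b) = gcd(b, a mod b).
function Get-Gcd {
    param([long]$A, [long]$B)
    # Base case: when the remainder reaches zero, A holds the divisor
    if ($B -eq 0) { return [math]::Abs($A) }
    # Recurse on the smaller pair
    return Get-Gcd -A $B -B ($A % $B)
}
```

Each call replaces the pair with a strictly smaller remainder, so the recursion depth stays small even for large inputs.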
Question Gathering
The Anticaptcha form was set in an iframe on the IceCTF platform site. After enabling all SSL/TLS protocol versions for HTTPS and supplying Referer and User-Agent headers, I was able to pull down the list of questions by sending a GET request to the iframe source site (https://<random chars>-anticaptcha.labs.icec.tf):
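The request itself is not shown here. One way to assemble it, sketched under the assumption that Invoke-WebRequest was used, is to build a parameter table and splat it. The helper name New-AnticaptchaRequest, the default User-Agent string, and the variables in the usage comments are all hypothetical:

```powershell
# Hypothetical helper: build the parameters for a GET request that carries
# Referer and User-Agent headers, ready to splat into Invoke-WebRequest.
function New-AnticaptchaRequest {
    param(
        [string]$Uri,        # iframe source URL
        [string]$Referer,    # page that embeds the iframe
        [string]$UserAgent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0'
    )
    return @{
        Uri       = $Uri
        Method    = 'Get'
        Headers   = @{ Referer = $Referer }
        UserAgent = $UserAgent
    }
}

# Usage (not executed here; $iframeUrl and $platformUrl are placeholders):
# [Net.ServicePointManager]::SecurityProtocol = 'Tls, Tls11, Tls12'
# $params   = New-AnticaptchaRequest -Uri $iframeUrl -Referer $platformUrl
# $response = Invoke-WebRequest @params
```

The SecurityProtocol line in the usage comment is a guess at what enabling the protocol versions looked like in Windows PowerShell.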
Question Parsing
To parse the questions out of the returned HTML, I started by looking at the page structure:
Since all of the questions and answer formats are contained in Table Data elements, I decided to target those:
This loop will gather each question and the desired answer format. I've found HTML parsing to be more of an art than a science, and this feat would be possible in a number of different ways. I happened to pick the TD elements because it made the most sense to me.
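The page structure and the original loop are not reproduced here, so the following is only a generic sketch of pulling text out of TD elements with a regular expression. The function name Get-TableCellText is hypothetical, and it assumes the cells are not nested inside one another:

```powershell
# Hypothetical helper: return the visible text of every <td> cell in an HTML string.
function Get-TableCellText {
    param([string]$Html)
    # (?is) = case-insensitive, and '.' also matches newlines
    $pattern = '(?is)<td\b[^>]*>(.*?)</td>'
    foreach ($m in [regex]::Matches($Html, $pattern)) {
        # Replace nested tags with spaces, decode entities, collapse whitespace
        $text = $m.Groups[1].Value -replace '<[^>]+>', ' '
        $text = [System.Net.WebUtility]::HtmlDecode($text)
        $text = ($text -replace '\s+', ' ').Trim()
        # Skip cells with no text (for example, cells holding only an input box)
        if ($text) { $text }
    }
}
```

A regex is fragile against arbitrary HTML; it is used here only because the target is a single, predictable page.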
Answering the Questions
Using the functions I outlined above, I was able to test the questions against the following regular expressions in order to compute the answer:
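The original expressions are not reproduced here. As a sketch of the approach, the switch below classifies a question and extracts its arguments, leaving the actual computation to the matching function. The patterns are guesses based on the three formats listed earlier (including the assumption that the word position is written like "3rd"), and the name Resolve-Question is hypothetical:

```powershell
# Hypothetical helper: match a question against the three known formats
# and return its kind plus the values captured from the text.
function Resolve-Question {
    param([string]$Question)
    switch -Regex ($Question) {
        '^What is the (\d+)\w* word in the following line:\s*(.+)$' {
            return [pscustomobject]@{ Kind = 'Word'; Arguments = @([int]$Matches[1], $Matches[2]) }
        }
        '^Is (\d+) a prime number\?' {
            return [pscustomobject]@{ Kind = 'Prime'; Arguments = @([long]$Matches[1]) }
        }
        '^What is the greatest common divisor of (\d+) and (\d+)\?' {
            return [pscustomobject]@{ Kind = 'Gcd'; Arguments = @([long]$Matches[1], [long]$Matches[2]) }
        }
        default {
            # Anything that fits none of the formats gets flagged for review
            Write-Warning "Unrecognized question: $Question"
            return $null
        }
    }
}
```

Inside each clause, the automatic $Matches variable holds the capture groups of the pattern that matched.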
The default action for this switch statement prints out a warning for any question that doesn't match one of the above three regular expressions. When running, I received the following warnings:
Looks like the IceCTF staff threw in some oddball questions for funsies, so I added an additional function to handle these questions specifically:
I added this new function to handle the "default" action of the Answer switch statement. The full function to get questions and answer them was:
Submitting Answers
The answers I found should be posted back to the Anticaptcha site, based on the form action on the page. Inspecting a random answer submission on the site in Burp showed the following POST request body format:
answer={ans0}&answer={ans1}...&submit="Submit+Answers"
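A body in that shape can be assembled with a short helper. This is a sketch rather than the original code: the name New-AnswerBody is hypothetical, and it assumes the quotes around Submit+Answers above are notation rather than part of the value sent:

```powershell
# Hypothetical helper: build a form-encoded body with one "answer" field
# per answer, in order, followed by the submit field.
function New-AnswerBody {
    param([string[]]$Answers)
    $pairs = @(foreach ($a in $Answers) {
        # UrlEncode escapes reserved characters and turns spaces into '+'
        'answer=' + [System.Net.WebUtility]::UrlEncode($a)
    })
    return ($pairs + 'submit=Submit+Answers') -join '&'
}
```

Because every field shares the name "answer", the order of the array is what ties each answer to its question.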
Putting it all together, the end result was this function:
Checking the Content of the response, I noticed:
What the....
But...why...how, what...?
I beat my head against that wall for a while to no avail. For whatever reason, I was unable to mimic a browser submission, so I was forced to go a different route.
Browser Automation
I knew of this neat browser automation tool called Selenium that was designed for web development unit testing. Thinking I could get around my impersonation shortcomings by driving a real-life browser, I looked up the project's PowerShell compatibility and discovered they had a module available.
With this module, you are able to point your puppet browser (Firefox in my case) at websites, select HTML elements, and send keystrokes/click events to selected elements. Should be perfect!
After navigating my puppet browser to the Anticaptcha challenge site, the following code filled in answers in the input boxes on the page and clicked the submit button when finished:
Video of the driven browser:
The final result:
Flag = IceCTF{ahh_we_have_been_captchured}