About Toptal
Before starting my story, I want to tell you a little about Toptal, but only a little, because this post is mainly about the interview process. Toptal is a popular freelancing platform that says it accepts the top 3% of engineers, developers, designers, and managers from around the world. It has many advantages: rates are consistently high, you do not need to hunt for jobs yourself as you do on other freelancing platforms, and you work with both top-level engineers and clients. The only disadvantage I found is how hard it is to get into Toptal :). You can find the other details yourself.
I wanted to challenge myself, so I applied for a Data Engineer position. To prepare, I did a lot of research: I read Medium blogs, looked through Glassdoor interview questions, and asked Toptal data engineers for tips. Still, I found very little information about the interview process for the data engineering position. Most resources cover backend, frontend, or mobile engineering roles, and there was almost nothing from data engineers. So I decided to share my experience, even though I did not make it to the hiring stage. I hope it will be useful.
Toptal Screening Process
The full screening process takes about 2–4 weeks. In my case, it took a little more than 2 weeks. The stages are:
- Language and Personality
- In-Depth Skill Review
- Live Screening
- Test Project
Round Zero: Applying
Before submitting my details, I spent a week solving LeetCode problems in Python to refresh my data structures and algorithms skills. I mainly practiced easy and medium questions on arrays, strings, and hash tables, plus a little dynamic programming. After that week, I signed up and submitted my details.
After completing the details, you choose the type of initial interview: either a live 10-minute interview or a recorded video. I preferred a live interview and chose a time slot. Once I finished, one of the communications specialists reached out to me by email with instructions for the introductory call. The main point was that arriving late to your interview may result in your application being paused for 60 days. The interview takes place on the BlueJeans web conferencing software.
Round One: Language and Personality test
There are usually no technical questions in this round. The interviewer talks a bit about Toptal and asks general questions about your experience, the technologies you know, your past projects, your contributions, and so on.
I passed this round and, to be honest, I didn't find it very difficult. I think you are mainly evaluated on communication and personality. Because of the NDA, I can't share the exact questions, but prepare your answers around questions like these:
- Tell me about yourself
- Why Toptal, and what have you heard about it?
- Tell me about a time when you made a mistake at work and how you handled it.
- What technologies are you most passionate about, and how proficient are you at them?
- What is your experience with technology A or B?
According to Toptal, about 74% of applicants fail this round.
Round Two: In-Depth Skill Review
In this round, you have to pass a timed online assessment on Codility based on your tech stack. Typically, it consists of three questions similar to LeetCode medium and hard questions. Unlike some other coding platforms, Codility scores your code on performance as well as correctness.
After passing the first round, I received an email inviting me to the second round.
Up to this stage, I had been preparing only for algorithmic coding questions. According to the email, however, the second round consisted of two SQL questions and a Data Processing task. What? I was a little nervous after reading that. I hadn't known this beforehand, because there was almost no information about the Data Engineering Codility task, and I had only 72 hours.
How I prepared
Fortunately, I was already familiar with SQL, but I needed to improve. I had only 72 hours and didn't want to waste any of them, so I started preparing immediately.
For SQL, I solved all the free easy, medium, and hard questions in the Database category on LeetCode and Codility. I also practiced HackerRank SQL questions at all three difficulty levels. I solved at least 15–20 questions per day.
After practicing SQL for two days, I switched to Data Processing on the third day. Thankfully, the recruiter extended the deadline by another 2–3 days, and I tried to use that time as wisely as possible. I mainly focused on Python and Pandas. I went through YouTube and Medium tutorials and finished some DataCamp courses, such as:
- Data Manipulation with Pandas
- Data Manipulation with Python
- Reshaping Data with Pandas
- Analyzing Police Activity with Pandas
In summary, I spent at least 6–7 hours per day preparing. In the assessment, I finished the two SQL tasks in about 50 minutes with a 100% score. I got stuck on the third task until an idea popped into my head in the last 15 minutes, but I couldn't finish implementing it before time ran out. Thankfully, I still passed this round 🎉, although it was quite challenging for me.
Again according to Toptal statistics, only 7.4% of applicants pass this round.
Round Three: Live Technical Screening
The third round is also one of the most challenging rounds of the interview process. Only 3.6% of applicants pass it. It is a roughly 60-minute session with a senior Toptal engineer, during which you share your screen on BlueJeans. It consists of a discussion of your second-round solutions (to check whether you solved them yourself), questions about your experience and tech stack knowledge, and two coding questions similar to those in the second round but somewhat simpler. To pass this round, you must solve each question within 15 minutes with 100% of the test cases passing. I think the complexity of your algorithms most probably doesn't matter and there won't be time limit errors, because you solve the problems in your local IDE. Your solution just needs to pass all the test cases.
Again I received an email, this time from a technical screener, with instructions for scheduling the interview. I chose the latest available slot, which was 5 days after the day I received the email.
I had 5 days to prepare. And again, I could find no information about the question types in this round for the Data Engineering position, either on Medium or on Glassdoor. So I wrote to my technical screener asking about the interview details, the question types, or anything else that might help me prepare. Unfortunately, he said he couldn't share many details. He did tell me, however, to expect questions related to a specific skill set and coding questions similar to those in the second round, and to prepare a local SQL environment in which I could create tables and execute queries.
How I prepared
From that message, it was not difficult to work out the priorities for the upcoming interview :). I needed to focus mostly on two areas: tech stack knowledge and SQL.
For tech stack knowledge, I explored the most common data engineering concepts, such as Data Modeling, Data Warehouses, Database Systems, ETL pipelines, Distributed Computing, and Cloud Computing. Of course, I knew that these skills cannot be acquired in just 3–4 days, but I did not waste time and did my best to improve my knowledge as much as possible. I watched and read mini-tutorials and documentation, spending about 1–3 hours on each of the topics above. I looked through the most frequently asked interview questions. I also finished some courses, mostly on DataCamp, including:
- Building Data Engineering Pipelines in Python
- Introduction to Airflow in Python
- AWS Cloud concepts
- Cloud Computing for Everyone
- Introduction to AWS boto in Python
- Big Data fundamentals with PySpark
- Snowflake in 20 minutes (not DataCamp)
As for SQL, to be honest, I was somewhat dissatisfied with the LeetCode and HackerRank SQL questions after doing the Codility second round. They seemed quite dated to me, so I looked for other resources to practice with. In the end, I found 2 awesome resources for acing SQL interview questions: stratascratch.com and interviewquery.com. Both provide up-to-date SQL and Python (Pandas) interview questions asked at Google, Facebook/Meta, Amazon, Twitter, Airbnb, and other top companies. However, most of their content is paid.
StrataScratch also offers free questions with video explanations, so I went with it and solved all the free questions at every difficulty level. At first they were quite tough, but after getting my hands dirty for some time, I started to feel comfortable even with medium and hard questions. In total, I solved about 10 easy, 15 medium, and 5 hard questions. For the first two days, I solved questions without a timer and just learned techniques, tips, and how to approach medium and hard queries. From the third day, I started practicing with a 15-minute time limit in my local environment (SQL Server Management Studio).
By the fifth day, I was solving most medium-level questions in about 8–10 minutes. What I learned from practicing these kinds of questions is that subqueries, window functions, date manipulation, string manipulation, self joins, and aggregate functions are must-know skills if you want to ace your SQL coding interview. In summary, I didn't go out for 3 days and spent about 8–9 hours per day preparing: about 4 hours on tech stack skills and about 4 hours on SQL.
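To make a few of the must-know skills above concrete, here is a minimal sketch that combines a subquery, a window function, an aggregate, and some date manipulation. It is not one of the interview questions. The `orders` table and its rows are invented for illustration, and it uses Python's built-in `sqlite3` module as a throwaway local environment instead of SQL Server, so date functions such as `strftime` are SQLite-specific. Window functions also require SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical table and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer TEXT NOT NULL,
        order_date TEXT NOT NULL,
        amount REAL NOT NULL
    );
    INSERT INTO orders (customer, order_date, amount) VALUES
        ('alice', '2022-01-05', 120.0),
        ('alice', '2022-02-10',  80.0),
        ('alice', '2022-03-15', 200.0),
        ('bob',   '2022-01-20',  50.0),
        ('bob',   '2022-03-01',  75.0);
""")

# Subquery + window function: number each customer's orders from the
# largest amount down, then keep only the first row per customer.
top_order_sql = """
    SELECT customer, order_date, amount
    FROM (
        SELECT customer, order_date, amount,
               ROW_NUMBER() OVER (
                   PARTITION BY customer ORDER BY amount DESC
               ) AS rn
        FROM orders
    ) AS ranked
    WHERE rn = 1
    ORDER BY customer;
"""
top_orders = conn.execute(top_order_sql).fetchall()

# Aggregate + date manipulation: total order amount per calendar month.
monthly_sql = """
    SELECT strftime('%Y-%m', order_date) AS month, SUM(amount) AS total
    FROM orders
    GROUP BY month
    ORDER BY month;
"""
monthly_totals = conn.execute(monthly_sql).fetchall()

print(top_orders)
print(monthly_totals)
```

The same "rank inside a subquery, then filter outside" pattern should carry over to SQL Server and most other engines, though the date functions will have different names there.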
In the interview, the interviewer first asked some general questions about my background and experience. After looking at my resume, he said that unfortunately they couldn't go on with my application, because my official experience was less than 2 years. It was very disappointing to hear this after so much preparation. I asked him to continue the process anyway and told him I would have no complaints if I failed the SQL coding questions, since I had prepared a lot. But he explained that they couldn't hire me even if I passed the interview, because of their contracts with clients. Toptal creates a resume for you and shares it with its clients, and clients expect to see at least 2–3 years of experience on it. Even though it made me sad, I understood the situation when I put myself in the clients' shoes.
However, the interviewer said that they did not want to reject my application, so they put it on hold. He told me that once I come back with 2 years of experience on my resume, I will be able to continue my application from the current round, and this made me pretty happy :) I liked the way Toptal treats both clients and engineers. So I can't give you more information about this round.
Round Four: Test Project
I can't say anything about this round, because I didn't get this far. According to the statistics, 3.2% of applicants pass this round.
At the end of the day, this was another cool experience for me. I learned a lot, updated my list of skills to acquire, and became more confident. By the way, sorry for the long story, but I needed one like this when I was searching for resources before the interview, so I wrote it myself.
Good luck!