Interesting Data Gigs # 7: Engineering Manager, Machine Learning at Truebill
Why you must follow Mikkel Dengsøe (Monzo Bank) and Alejandro Saucedo (Seldon)
🚨 Join the Interesting Data Gigs Talent Network 🚨
It’s the perfect time to be part of The Interesting Data Gigs Talent Network, where you will find amazing Data Analytics jobs from companies like Netflix, Apple, Consensys, and many more.
Let’s change the game together: instead of people applying to companies, companies will pitch to you, so don’t wait another moment and join today.
Hey Data lovers, it’s Marcos again with a new edition of the Interesting Data Gigs newsletter.
This time, we will talk about a very interesting data gig open at Truebill. The open position is called Engineering Manager for Machine Learning, and you can find it on the Interesting Data Gigs Jobs Board. (That’s why I want to see you there.)
Truebill is one of my favorite personal finance apps out there, and very close to my heart. Why? Let me explain.
Did I share with you that I run a YouTube channel (Spanish-based) that is focused on personal finance, investments, and that kind of stuff? If you want to check it out, feel free to do it here.
One of my missions in life is to teach Hispanic people all these complex financial concepts and help them understand, in an easier way, why they need to invest, why they need to have an emergency fund, why they need to have a clear budget, etc.
Truebill is perfect for this, and more. So, when I saw the chance to write about it, I said: “let’s do it”.
But first, let’s find out: what actually is Truebill?
According to the Press website of the company, this is an accurate description of it:
It’s basically an app where:
You can manage your subscriptions (my top favorite feature)
It helps you to save more money with its Autopilot Saving feature
It helps you to understand your spending habits, and how to improve them
So, they got into Y-Combinator and they raised a total of $83.9 Million in outside capital, before the acquisition by Rocket Companies for $1,275 Million in cash.
Let’s dissect the role we are discussing today and discuss some ideas on how to approach this job application (THE REAL MEAT)
Download the app ASAP
My first piece of advice is always the same:
Even if you don’t get the role, you will love this product, and it could help you to find those obscure subscriptions you didn’t know were out there.
Now, let’s review the job description closely and discuss some key points here:
Machine Learning is a key component of Truebill’s underlying technology, so you have to be a real Machine Learning Practitioner here, especially with actual experience deploying models to production.
Again: this is a practitioner role. You have to actually write code here, not only manage the team.
This part is amazing and at the same time very challenging, because with the growth of Truebill, I truly believe they can reach this scale very quickly.
In today’s environment where many people are working in hybrid mode (some days in the office, some days from home), being a clear communicator is key, and being a clear writer is even better, especially with this kind of system that is very complex.
This is a very interesting point as well: for example, if you use Google Cloud Platform for model building, you can work with incredible things like Cloud Dataflow (based on Apache Beam) for batch and streaming analytics efforts, or Vertex AI. In the case of AWS, you can use SageMaker and the whole ecosystem behind it.
Two amazing books I can recommend here are these:
Machine Learning Design Patterns: Solutions to Common Challenges in Data Preparation, Model Building, and MLOps, by Valliappa (Lak) Lakshmanan, Sara Robinson and Michael Munn, focused on GCP-based solutions
Data Science on AWS: Implementing End-to-End, Continuous AI and Machine Learning Pipelines, by Chris Fregly and Antje Barth, focused on AWS-based solutions
Chat with people that “have walked the talk”
The second piece of advice I can give you here is that you should reach out to people that have built these systems from scratch in heavily Data-Driven companies like Stitch Fix, Netflix, Facebook, Amazon, PayPal, Cash App, etc.
I will leave you 3 potential candidates to invite for a coffee:
The whole idea here is to discuss some ideas, approaches, and best practices to build large-scale Machine Learning based systems. People love to chat about this, so be a sponge here.
Chat with people from the company
This is always welcome: it proves that you are actually interested in being part of the company, and if you come up with good ideas even before taking the role, even better.
I will leave here two Eng Managers:
Take all this and make a killing in your application.
Other featured jobs from the Interesting Data Gigs Job Board
Why should you follow these guys? Let me explain why.
In the case of Mikkel, he is working right now as the Head of Data Science, Operations & Financial Crime at Monzo Bank.
What does this mean to you? It’s very simple: the wisdom and knowledge about managing Data teams he is sharing on LinkedIn and his newsletter on Substack are simply amazing.
Two of my favorite posts from Mikkel are the one where he explains Data salaries at FAANG companies, comparing U.S. with Europe-based salaries, and the one explaining the Data team structure.
Subscribe to his newsletter. Believe me: you will grow as a Data people manager.
In the case of Alejandro: if you are a Machine Learning practitioner and you are not subscribed to his awesome newsletter, stop what you are doing here, and please subscribe to it. If you prefer LinkedIn, the link to the newsletter is this one.
In every edition, he curates a lot of great resources related to Machine Learning, so from my perspective, this is a unique and amazing way to follow the trends in the field, and keep your knowledge fresh on it.
For example, in the last edition (No. 184), he shared the amazing new Machine Learning course from Andrew Ng on Coursera, an incredible MLOps course that is completely free, and many more cool things.
Seriously: you need to follow his work right now.
Interesting Open Source projects related to Data
Modin: A drop-in replacement for pandas. While pandas is single-threaded, Modin lets you instantly speed up your workflows by scaling pandas so it uses all of your cores. Modin works especially well on larger datasets, where pandas becomes painfully slow or runs out of memory. You can download it here, and if you have any questions, chat with Doris Lee, Aditya Parameswaran, and Devin Petersohn from Ponder
data-diff: A command-line tool and Python library to efficiently diff rows across two different databases.
- Verifies across many different databases (e.g. PostgreSQL -> Snowflake)
- Outputs diff of rows in detail
- Simple CLI/API to create monitoring and alerts
- Bridges column types of different formats and levels of precision (e.g. Double ⇆ Float ⇆ Decimal)
- Verifies 25M+ rows in < 10s, and 1B+ rows in ~5min
- Works for tables with 10s of billions of rows
If you have any questions about data-diff, chat with Ilia Pinchuk & Clay Moeller from Datafold
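To make the idea of "diffing rows across two databases" more concrete, here is a minimal sketch I put together using only the Python standard library. It does NOT use the data-diff API or its algorithm: the `users` table, the `make_db`, `row_hashes`, and `diff_tables` helpers, and the sample rows are all hypothetical, and two in-memory SQLite databases stand in for the source and target systems.

```python
import hashlib
import sqlite3


def make_db(rows):
    """Create an in-memory SQLite database with a small, made-up `users` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, plan TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def row_hashes(conn, table, key):
    """Map each key value to a hash of its full row."""
    cursor = conn.execute(f"SELECT * FROM {table} ORDER BY {key}")
    columns = [col[0] for col in cursor.description]
    key_index = columns.index(key)
    return {
        row[key_index]: hashlib.md5(repr(row).encode("utf-8")).hexdigest()
        for row in cursor
    }


def diff_tables(source, target, table="users", key="id"):
    """Report keys that are missing, extra, or changed in the target."""
    src = row_hashes(source, table, key)
    dst = row_hashes(target, table, key)
    return {
        "missing_in_target": sorted(src.keys() - dst.keys()),
        "extra_in_target": sorted(dst.keys() - src.keys()),
        "changed": sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k]),
    }


# Hypothetical data: row 2 changed its plan, row 3 was lost, row 4 is new.
source = make_db([
    (1, "ana@example.com", "free"),
    (2, "bob@example.com", "pro"),
    (3, "cy@example.com", "pro"),
])
target = make_db([
    (1, "ana@example.com", "free"),
    (2, "bob@example.com", "free"),
    (4, "dee@example.com", "pro"),
])

result = diff_tables(source, target)
print(result)
```

This toy version pulls every row into memory, which is fine for a demo but would not scale to the table sizes mentioned above; for real workloads, check the data-diff documentation.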
hamilton: A scalable general-purpose micro-framework for defining dataflows. You can use it to create dataframes, numpy matrices, python objects, ML models, etc. If you have any questions, chat with Stefan Krawczyk
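If "defining dataflows" sounds abstract, here is a tiny sketch of the general idea as I understand it: each function is a node, its name is the output it produces, and its parameter names are the inputs it depends on. This is NOT Hamilton's API; the `execute` resolver, the node functions, and the sample numbers are all hypothetical and written with the standard library only.

```python
import inspect


# Each function is one node in the dataflow. Its name is the value it
# produces, and its parameter names are the values it depends on.
def total_spend(transactions):
    return sum(transactions)


def subscription_spend(subscriptions):
    return sum(subscriptions.values())


def subscription_share(subscription_spend, total_spend):
    return subscription_spend / total_spend


def execute(target, functions, inputs):
    """Compute `target` by recursively resolving the nodes it depends on."""
    cache = dict(inputs)

    def resolve(name):
        if name not in cache:
            fn = functions[name]  # KeyError if the name is neither an input nor a node
            args = {param: resolve(param) for param in inspect.signature(fn).parameters}
            cache[name] = fn(**args)
        return cache[name]

    return resolve(target)


# Hypothetical inputs, just to run the flow end to end.
nodes = {fn.__name__: fn for fn in (total_spend, subscription_spend, subscription_share)}
share = execute(
    "subscription_share",
    nodes,
    {
        "transactions": [40.0, 35.0, 25.0],
        "subscriptions": {"music": 10.0, "video": 15.0},
    },
)
print(share)
```

The nice part of this style is that every node is a plain function you can unit test on its own. For the real thing, check the project's documentation.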
Interesting resources of the week
Applying federated learning to protect data on mobile devices, by Meta Engineering
If you’re finding this newsletter valuable, consider sharing it with friends, or subscribing if you haven’t already.