Interesting Data Gigs by Marcos Ortiz
Interesting Data Gigs # 7: Engineering Manager, Machine Learning at Truebill

Why you must follow Mikkel Dengsøe (Monzo Bank) and Alejandro Saucedo (Seldon)

🚨 Join the Interesting Data Gigs Talent Network 🚨

It’s the perfect time to be part of The Interesting Data Gigs Talent Network, where you will find amazing Data Analytics jobs from companies like Netflix, Apple, Consensys, and many more.

Let’s change the game together: instead of people applying to companies, companies will pitch to you. Don’t wait another moment and join today.

Hey Data lovers, it’s Marcos again with a new edition of the Interesting Data Gigs newsletter.

This time, we will talk about a very interesting data gig open at Truebill. The open position is called Engineering Manager for Machine Learning, and you can find it on the Interesting Data Gigs Jobs Board. (That’s why I wanted to see you there).

Truebill is one of my favorite personal finance apps out there, and very close to my heart. Why? Let me explain.

Did I share with you that I run a YouTube channel (Spanish-based) that is focused on personal finance, investments, and that kind of stuff? If you want to check it out, feel free to do it here.

One of my missions in life is to teach Hispanic people all these complex financial concepts and help them understand, in an easier way, why they need to invest, why they need an emergency fund, why they need a clear budget, etc.

Truebill is perfect for this, and more. So, when I saw the chance to write about it, I said: “let’s do it”.

But first, what actually is Truebill?

According to the company’s press page, this is an accurate description of it:

Truebill is a leading personal finance app that analyzes members' spending habits, identifies inefficiencies, and offers immediate methods to improve their financial health. It enables people to optimize their spending, manage subscriptions, lower their bills, and automatically set aside money to reach their savings goals. Truebill has saved members more than $245 million since 2016 and is headquartered in Silver Spring, Maryland, with offices in San Francisco.

Truebill's mission is to empower people to live their best financial lives. Truebill offers members a unique understanding of their finances and a suite of valuable services that save them time and money - ultimately giving them a leg up on their financial journey.

In late 2021, Truebill was acquired by Rocket Companies for $1.275 billion in cash. Truebill's ability to leverage technology to constantly improve their members' financial health adds to Rocket Companies' end-to-end real estate and home financing platform. Meanwhile, Rocket’s extensive user base and tools allow Truebill to extend outreach and seamlessly connect consumers with even more financial services. Truebill is now more equipped than ever to deliver products that help members achieve financial preparedness during life’s complex moments.

It’s basically an app where:

You can manage your subscriptions (my top favorite feature)
It helps you to save more money with its Autopilot Saving feature
It helps you understand your spending habits and how to improve them
It helps you improve your credit score, and it can negotiate your bills for you to lower them

The company was founded by three brothers, Haroon Mokhtarzada (CEO), Yahya Mokhtarzada (CRO), and Idris Mokhtarzada (CTO), and the story of how they got the idea for the company is amazing:

The idea didn't come to us right away. After our first company webs.com had sold, we wanted to get the band together. We sat around for weeks trying to think of our next company idea. We had all sorts of thoughts. A Virtual Reality product? A child safe router? And so much more.

While we were thinking, two of us separately noticed odd things on our credit cards. We found an in-flight wifi charge, originally intended to be a single purchase, that was actually charging us for 14 months. Apparently it was a subscription. We also saw we'd been paying for a home security subscription on an old house long after moving out. So, we put the idea of a subscription cancellation platform on the white board and the rest was history.

We didn't think it would be a big company. We just knew we needed to start building something. So we asked Idris, who has always been our chief technology officer, if he could create an algorithm that would find subscriptions.

When we sent it to friends, two things happened. First everyone was finding recurring charges they didn't know about. And second, fifteen to thirty people were signing up each day organically, so we could tell right away there was a huge need for this program.

We were collectively struck by people's financial inefficiencies and knew this had to be a thing. We applied on a whim to Y Combinator and much to our surprise, we were able to get funding right away.

So, they got into Y Combinator and raised a total of $83.9 million in outside capital before Rocket Companies acquired them for $1.275 billion in cash.
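
A quick aside on that “algorithm that would find subscriptions”: Truebill’s real implementation is not public, so the sketch below is only a toy illustration of the idea. It assumes a hypothetical transaction format of (merchant, amount, date) tuples and flags a merchant as a subscription when the same amount is charged at roughly monthly intervals. The function name, the 30-day target, and the thresholds are all assumptions chosen for this example.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def find_recurring_charges(transactions, min_occurrences=3, tolerance_days=3):
    """Return charges that repeat at a roughly monthly interval.

    transactions: iterable of (merchant, amount, date) tuples.
    This input format is hypothetical, not Truebill's.
    """
    by_key = defaultdict(list)
    for merchant, amount, day in transactions:
        # Group by normalized merchant name and exact amount.
        # Real bank data would need much fuzzier matching than this.
        by_key[(merchant.strip().lower(), round(amount, 2))].append(day)

    recurring = []
    for (merchant, amount), days in by_key.items():
        if len(days) < min_occurrences:
            continue
        days.sort()
        # Number of days between each pair of consecutive charges.
        gaps = [(later - earlier).days for earlier, later in zip(days, days[1:])]
        # Assumption: every gap close to 30 days means a monthly subscription.
        if all(abs(gap - 30) <= tolerance_days for gap in gaps):
            recurring.append({
                "merchant": merchant,
                "amount": amount,
                "charges": len(days),
                "typical_gap_days": median(gaps),
            })
    return recurring


# Made-up sample data for illustration only.
sample = [
    ("InFlight WiFi", 9.99, date(2021, 1, 5)),
    ("InFlight WiFi", 9.99, date(2021, 2, 5)),
    ("InFlight WiFi", 9.99, date(2021, 3, 5)),
    ("InFlight WiFi", 9.99, date(2021, 4, 5)),
    ("Coffee Shop", 4.50, date(2021, 1, 7)),
    ("Coffee Shop", 4.50, date(2021, 1, 9)),
    ("Coffee Shop", 4.50, date(2021, 1, 20)),
]

for charge in find_recurring_charges(sample):
    print(charge)
```

On this sample, only the in-flight wifi charge is flagged: the coffee purchases repeat, but not on a monthly cadence. A production system would also have to handle messy merchant names, amounts that vary, and weekly or yearly billing cycles.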

Let’s dissect the role we are discussing today and discuss some ideas on how to approach this job application (THE REAL MEAT)

Download the app ASAP

My first advice is always the same:

Please use the product. Download the app, and get familiar with it.

Even if you don’t get the role, you will love this product, and it could help you find those obscure subscriptions you didn’t know you had.

Now, let’s review the job description closely and discuss some key points here:

Truebill relies on machine learning in a number of critical path systems including transaction classification, customer lifetime value estimation, and cash flow prediction.

Machine Learning is a key component of Truebill’s underlying technology, so you have to be a real Machine Learning Practitioner here, especially with actual experience deploying models to production.

This is a hybrid managerial role that is a mix of technical and people leadership where you'll get a chance to write code and know the inner operating details of the system, while also building out the operating system to enable the team to scale.

Again: this is a practitioner role. You have to actually write code here, not only manage the team.

Design and lead the construction of robust machine learning patterns and systems that enable Truebill to scale to millions (billions?!) of predictions per day

This part is amazing and at the same time very challenging, because with the growth of Truebill, I truly believe they can reach this scale very quickly.

You have demonstrated practical experience running multiple ML projects in a production environment that are critical paths where inference time and reliability are key operating metrics

Ferocious documenter (we're an async organization that works remotely and documentation is essential to our success). Bonus points if you fix Notion search someday

In today’s environment where many people are working in hybrid mode (some days in the office, some days from home), being a clear communicator is key, and being a clear writer is even better, especially with this kind of system that is very complex.

Bonus points for experience with low latency streaming solutions on modern cloud workflow pipelines

This is a very interesting point as well: for example, if you use Google Cloud Platform for model building, you can work with incredible things like Cloud Dataflow (based on Apache Beam) for batch and streaming analytics, or Vertex AI. In the case of AWS, you can use SageMaker and the whole ecosystem behind it.

Two amazing books I can recommend here:

Machine Learning Design Patterns: Solutions to Common Challenges in Data Preparation, Model Building, and MLOps, by Valliappa (Lak) Lakshmanan, Sara Robinson and Michael Munn, focused on GCP-based solutions
Data Science on AWS: Implementing End-to-End, Continuous AI and Machine Learning Pipelines, by Chris Fregly and Antje Barth, focused on AWS-based solutions

Chat with people who have “walked the talk”

The second piece of advice I can give you here is that you should reach out to people that have built these systems from scratch in heavily Data-Driven companies like Stitch Fix, Netflix, Facebook, Amazon, PayPal, Cash App, etc.

I will leave you three potential candidates to invite for a coffee:

Stefan Krawczyk (Mgr. & Lead ML Platform/Data Platform at Stitch Fix)
Moein Saleh (Senior Manager, Machine Learning at PayPal)
Juan Hernandez (Machine Learning Engineering Manager at Cash App)

The whole idea here is to discuss some ideas, approaches, and best practices to build large-scale Machine Learning based systems. People love to chat about this, so be a sponge here.

Chat with people from the company

This is always welcome: it proves that you are actually interested in being part of the company, and if you come up with good ideas before even taking the role, even better.

I will leave here two Engineering Managers:

Phil Boothe
Patrick Carroll

Take all this and make a killing in your application.

Good luck.

Other featured jobs from the Interesting Data Gigs Job Board

People to follow: Mikkel Dengsøe (Monzo Bank) and Alejandro Saucedo (Seldon)

Why should you follow these guys? Let me explain why.

In the case of Mikkel, he is working right now as the Head of Data Science, Operations & Financial Crime at Monzo Bank.

What does this mean to you? It’s very simple: the wisdom and knowledge about managing Data teams that he shares on LinkedIn and in his newsletter on Substack are simply amazing.

Two of my favorite posts from Mikkel are the one where he explains Data salaries at FAANG companies, comparing U.S.-based with Europe-based salaries, and this one explaining the Data team structure.

Subscribe to his newsletter. Believe me: you will grow as a Data people manager.

In the case of Alejandro: if you are a Machine Learning practitioner and you are not subscribed to his awesome newsletter, stop what you are doing and please subscribe to it. If you prefer LinkedIn, the link to the newsletter is this one.

In every edition, he curates a lot of great resources related to Machine Learning, so from my perspective, this is a unique and amazing way to follow the trends in the field, and keep your knowledge fresh on it.

For example, in the last edition (No. 184), he shared the amazing new Machine Learning course from Andrew Ng on Coursera, an incredible MLOps course that is completely free, and many more cool things.

Seriously: you need to follow his work right now.

Interesting Open Source projects related to Data

Modin: A drop-in replacement for pandas. While pandas is single-threaded, Modin lets you instantly speed up your workflows by scaling pandas so it uses all of your cores. Modin works especially well on larger datasets, where pandas becomes painfully slow or runs out of memory. You can download it here, and if you have any questions, chat with Doris Lee, Aditya Parameswaran, and Devin Petersohn from Ponder
data-diff: A command-line tool and Python library to efficiently diff rows across two different databases.
- Verifies across many different databases (e.g. PostgreSQL -> Snowflake)
- Outputs diff of rows in detail
- Simple CLI/API to create monitoring and alerts
- Bridges column types of different formats and levels of precision (e.g. Double ⇆ Float ⇆ Decimal)
- Verifies 25M+ rows in < 10s, and 1B+ rows in ~5min
- Works for tables with 10s of billions of rows
If you have any questions about data-diff, chat with Ilia Pinchuk & Clay Moeller from Datafold
Bill - the optimization cost bot, created by Christian Bonzelet
hamilton: A scalable general-purpose micro-framework for defining dataflows. You can use it to create dataframes, numpy matrices, python objects, ML models, etc. If you have any questions, chat with Stefan Krawczyk

Interesting resources of the week

The incredible Open Source MLOps course shared by Alejandro is too good to not share here
The Day 1 ship at Watershed, by Jessica Zhu
How to integrate Databricks with Redpanda, by Osinachi Chukwujama
Monte Carlo Announces Delta Lake, Unity Catalog Integrations To Bring End-to-End Data Observability to Databricks, by Prateek Chawla
Evaluating Graviton 2 for data-intensive applications: An Arm vs Intel comparison, by Travis Downs
How Netflix Content Engineering makes a federated graph searchable (part I), Part II, by Alex Hutter, Falguni Jhaveri, and Senthil Sayeebaba
We Put Half a Million files in One git Repository, Here’s What We Learned, by Anh Le
Applying federated learning to protect data on mobile devices, by Meta Engineering
Griffin: How Instacart’s ML Platform Tripled ML Applications in a year, by Sahil Khanna
Debugging Ad Delivery At Pinterest, by Nishant Roy

If you’re finding this newsletter valuable, consider sharing it with friends, or subscribing if you haven’t already.
