Research Software Engineer (f/m/d)
Aleph Alpha
Overview:
Aleph Alpha Research’s mission is to deliver category-defining AI innovation that enables open, accessible, and trustworthy deployment of GenAI in industrial applications. Our organization develops foundational models and next-generation methods that make it easy and affordable for Aleph Alpha’s customers to increase productivity in development, engineering, logistics, and manufacturing processes.
We are hiring to grow our org in Heidelberg, Germany, and are looking for well-rounded, experienced Research Software Engineers with experience in DevOps/MLOps.
As a Research Software Engineer in Aleph Alpha Research, you help the research teams take model and algorithm development to the next level. You own significant portions of the research infrastructure, including the pipelines related to data processing, our testing infrastructure, and engineering-heavy parts of our distributed training software. You will also have the chance to contribute your software engineering experience to research projects in areas such as tokenization, agent interfaces, and data generation, and thus have a significant influence on our ability to deliver novel category-defining AI capabilities.
You devise and implement robust, maintainable complex systems that make POCs, ablation studies, and new algorithmic capabilities, as well as their transition into production, a great experience for research and product development teams alike. You likewise co-own efforts to make parts of our source code available to the broader research community.
Your responsibilities:
Depending on your profile, you will contribute to one or more of the following areas:
Design and continuously develop the research infrastructure, and establish mechanisms that improve code quality, testing, and feature delivery
Support the development, training, and maintenance of deep learning models, in collaboration with the researchers as well as the SW/HW engineers at our distributed computation centers
Develop and optimize lower-level code for data processing, tokenization, or research projects
Contribute your software-engineering expertise to research projects (for example, in areas such as agent interfaces or data generation)
Help turn AI research innovations into real-world applications
Engage in our hiring process and mentor engineers and researchers in software development best practices
Most of our training code is written in Python, with PyTorch being our main deep learning framework. Some of our lower-level code is written in Rust.
Your profile:
Basic Qualifications
3+ years of non-internship professional software development experience
2+ years of non-internship experience in the design or architecture (design patterns, reliability, and scaling) of new and existing systems
Experience programming in at least one programming language
Ready to relocate to Heidelberg, Germany
Bachelor's degree in computer science or equivalent
Preferred Qualifications
5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations, as well as with architecting (design patterns, reliability, and scaling) new and existing systems
Track record in building complex software systems (with or without machine-learning components), e.g., via open-source project contributions
Experience with systems programming and low-level code, e.g. with Rust
Strong sense of ownership and customer-centricity; passion for building quality software and improving operational excellence; and a demonstrated ability to achieve stretch goals in a highly innovative and fast-paced team environment
Master's degree in computer science or related field
We do not require prior experience in machine learning for this role, but we do value your eagerness to learn. If you have prior experience in ML, we will be particularly excited about:
Experience in the productization of AI research innovations into real-world applications, ideally with a focus on large-scale data processing and distributed computation for foundational model training or inference
Familiarity with popular NLP tools and frameworks such as PyTorch or HF transformers, knowledge of transformer architectures
What you can expect from us:
Become part of an AI revolution
30 days of paid vacation
Flexible working hours
Join a dynamic start-up and a rapidly growing team
Work with international industry and science experts
Take on responsibility and shape our company and technology
Regular team events