Kyutai, the AI research lab behind Moshi AI, is hiring
Moshi, a voice-activated conversational AI platform, is seen as a promising sign of Europe’s potential in the global AI landscape. We take a look at what the platform demo can do.
If you’re the right fit and happy to move to Paris, download a French language app and get your portfolio site ready. They’re hiring.
FYI: The founding Kyutai team is 100% blokey. It could do with an injection of diversity!
What is Kyutai?
Kyutai, launched in November 2023 with a reported €300 million ($324 million) in funding, boasts prominent backers including French billionaires Xavier Niel and Rodolphe Saadé, as well as former Google chairman Eric Schmidt. Helmed by Patrick Pérez, former director for Valeo SA, the lab has recruited researchers from Google DeepMind and Meta Platforms Inc.
Kyutai is a non-profit laboratory dedicated to open research in AI, founded by the iliad Group, CMA CGM and Schmidt Sciences. Launched with a team of just six scientists, all of whom have worked with Big Tech labs in the US, Kyutai continues to recruit at the highest level.
The latest recruitment news is that Kyutai is growing and hiring research scientists and engineers for post-doc and full-time technical positions. However, they “are no longer actively looking for interns as our next internships will only start early 2025.”
All positions are based in Paris, France. Recruitment appears to be handled through Homerun’s hiring tool.
What’s next for Kyutai?
Now comprising a dozen members, the team will launch its first PhD thesis at the end of the year. The lab is currently working in particular on multimodality, i.e., the possibility for a model to exploit different types of content (text, sound, images, etc.) both for learning and for inference.
According to Kyutai, all the models it develops are intended to be freely shared, as are the software and know-how that enabled their creation. To carry out its work and train its models, Kyutai relies in particular on the compute capacity of the Nabu 23 superpod, a scalable AI infrastructure platform made available by Scaleway, a subsidiary of the iliad Group.
What can the Moshi AI demo platform do?
Moshi is an experimental conversational AI platform developed by Kyutai. What we know so far is that it is designed to be more natural and engaging than other AI assistants, and that it can understand the tone of your voice and add a layer of emotional intelligence to conversations.
Moshi is still under development, but it is already able to hold conversations on a variety of topics. It can also perform small talk, explain various concepts, and engage in roleplay in many emotions and speaking styles.
Moshi is not an assistant, but rather a prototype for advancing real-time interaction with machines. It can chit-chat, discuss facts and make recommendations, but its more groundbreaking ability is an expressivity and spontaneity that allow for engaging in fun roleplay.
Developing Moshi required significant contributions to audio codecs, multimodal LLMs, multimodal instruction tuning and much more. The main impact of the project lies in sharing all of Moshi’s secrets through the upcoming paper and the open-source release of the model.
Moshi reviews
Here is what tech commentator Johan Sanneblad, Director of Human-centred AI at Rise Research Institutes of Sweden, had to say after he gave the platform a test talk:
The quality of their language model Helium-7B, in particular responses to the questions they asked, was quite bad even for a 7B model. When asked “What kind of gear do you need to bring for mountain climbing?” the model responded with: “You might want to take your time getting your climbing shoes on, because you don’t want to be using a egg.”
These weird types of answers just kept showing even in this controlled demo. I also thought the way the AI constantly interrupted the speakers in the keynote was annoying, but I can see why they did it for demonstrational purposes. It’s well worth your time checking their presentation if you have time for it, it was the first time I have heard an English text-to-speech engine speak with a clear French accent.
Experiment with the Moshi demo
YouTube channel World of AI has also done a test run of the demo and given some feedback and tips on how to sign up. Granted, the prompts are a bit silly (one even tests whether Moshi can hold a conversation in a whispering voice), but why not have fun with it? You can’t exactly chat with Moshi in the office without being overheard, so try it at home or when the fun bunch in the office go out for a coffee or lunch break.
For now, you can experiment with Moshi through the online demo, whether for fun or for work. Before you do, read the terms and conditions: you will be using your voice, and you may have reservations about that if you ask questions or accidentally share personal information. See moshi-terms.pdf (kyutai.org).