Inria

Post-Doctoral Research Visit F/M Distributed Machine Learning at the Network Edge

Click Here to Apply

Job Location

Antibes, France

Job Description

Type de contrat : CDD

Niveau de diplôme exigé : Thèse ou équivalent

Fonction : Post-Doctorant

A propos du centre ou de la direction fonctionnelle

The Inria centre at Université Côte d'Azur includes 37 research teams and 8 support services. The centre's staff (about 500 people) is made up of scientists of different nationalities, engineers, technicians and administrative staff. The teams are mainly located on the university campuses of Sophia Antipolis and Nice as well as Montpellier, in close collaboration with research and higher education laboratories and establishments (Université Côte d'Azur, CNRS, INRAE, INSERM ...), but also with the regional economic players.

With a presence in the fields of computational neuroscience and biology, data science and modeling, software engineering and certification, as well as collaborative robotics, the Inria Centre at Université Côte d'Azur is a major player in terms of scientific excellence through its results and collaborations at both European and international levels.

Contexte et atouts du poste

The position is in the framework of dAIEDGE-A network of excellence for distributed, trustworthy, efficient and scalable AI at the Edge-funded by the European Union.

The vision of the dAIEDGE Network of Excellence is to strengthen and support the development of the dynamic European edge AI ecosystem under the umbrella of the European AI Lighthouse and to sustain the advanced research and innovation of distributed AI at the edge as essential digital, enabling, and emerging technology in an extensive range of industrial sectors.

The candidate will work with NEO Inria team (https://team.inria.fr/neo/) and COATI Inria team (https://team.inria.fr/coati/), and in particular with:

Giovanni Neglia

Chuan Xu

Frédéric Giroire

Mission confiée

The Internet was conceived to enable computer resources’ time-sharing, but soon its main function became to deliver content to end users, but it is now called to play a new key role: to pervasively support machine learning (ML) operation both for model training and prediction serving.

There are two aspects calling for Internet-wide deployment of ML systems. First, data-one key ingredient of ML success-is often generated by users and devices at the edge of the network. The classic ML operation in the cloud requires such data to be collected at a single computing facility where training occurs. Data aggregation can be very costly, or simply impossible because of capacity constraints, privacy issues, or ownership ones. These scenarios call for distributed learning systems, where computation moves, at least in part, to the data. For example, Google’s federated learning enables mobile phones, or other devices with limited computing capabilities, to collaboratively learn an ML model while keeping all training data locally. Distributed ML training is already a difficult task in a cluster setting. Indeed, optimization techniques, distributed systems, and ML models are a triad difficult to untangle: e.g., relaxed state consistency across computing nodes increases system throughput but may jeopardize convergence of the optimization algorithm or affect the final solution selected, leading to models with very different generalization capabilities. Additional challenges arise when training moves to the Internet. First, the system potentially scales up to billions of devices, against at most thousands of GPUs to break ML training records in a cluster. Second, local datasets are highly heterogeneous with very different sizes and feature/label distributions. Third, devices may have very different hardware and connectivity. Fourth, communications are often unreliable (devices can be switched off at any time), slow (latencies are 2 orders of magnitude larger), and expensive for battery-constrained devices. Fifth, privacy concerns are often important and limit the operations that can be performed during training to avoid inadvertently disclosing sensible information. Finally, training is more vulnerable to malicious attacks. For all these reasons, federated learning (as ML training over the Internet is now usually called) has emerged in the last years as a specific research topic—well distinct for example from high-performance computing or cloud computing—at the intersection of machine learning, optimization, distributed systems, and networking.

The second driver to distribute ML processes over the Internet is real-time inference. In fact, ML models are often trained for inference’s purposes, i.e., to make predictions on new data. Model predictions need then to be served to the final users. ML training is a computationally expensive operation and is the object of much research effort. Inference does not involve complex iterative algorithms and is therefore generally assumed to be easy, but it also presents fundamental challenges that are likely to become dominant as ML adoption increases. AI systems will be ubiquitously deployed and will need to make timely and safe decisions in unpredictable environments. In this case, inference must run in real-time, and predictions may need to be served at a very high rate. The big cloud players—Amazon, Microsoft, and Google—have all started pushing their “machine learning as a service” (MLaaS) solutions. Running the models in the cloud guarantees high scalability, but may fail to meet delay constraints. As an example, already deployed applications, such as recommendation systems, voice assistants, and ad-targeting, need to serve predictions from ML models in less than 20 ms. Future wireless services, such as connected and autonomous cars, industrial robotics, mobile gaming, augmented/virtual reality, have even stricter latency requirements, often below 10 ms and below 1 ms for the so-called tactile internet. It is then imperative to run these services closer to the user at the network edge. 5G deployment can provide computing and storage capabilities at the edge, but those will still be very limited in comparison to the cloud and need to be wisely used. In conclusion, inference will require complex resource orchestration across users’ devices, edge computing servers, and the cloud.

We are looking for a postdoc candidate who could join our team to work on one or more of the following topics (for which we provide pointers to our publications):

Distributed Inference

Online Learning Algorithms with Regret Guarantees

Distributed/Federated Learning

Machine Learning Privacy

We expect the postdoc to actively participate in the activities of the EU project dAIEDGE (e.g., attending meetings, coordinating Inria contribution to deliverables). The postdoc will also have the opportunity to collaborate with PhD students working on the topics listed above.

Principales activités

Beside carrying out high quality research, we expect the postdoc to actively participate to the activities of the EU project dAIEDGE (e.g., attending meetings, coordinating Inria contribution to deliverables).

The postdoc will also have the opportunity to collaborate with PhD students working on the topics listed above.

Compétences

Candidates must hold a Ph.D. in Applied Mathematics, Computer Science or a closely related discipline. Candidates must also show evidence of research productivity (e.g. papers, patents, presentations, etc.) at the highest level.

We prefer candidates who have strong mathematical background (on optimization, statistical learning or privacy) and in general are keen on using mathematics to model real problems and get insights. The candidate should also be knowledgeable on machine learning and have good programming skills. Previous experiences with PyTorch or TensorFlow is a plus.

The position is for 18 months, but it can be extended up to 30 months.

Avantages

Subsidized meals

Partial reimbursement of public transport costs

Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)

Possibility of teleworking and flexible organization of working hours

Professional equipment available (videoconferencing, loan of computer equipment, etc.)

Social, cultural and sports events and activities

Access to vocational training

Contribution to mutual insurance (subject to conditions)

Rémunération

Gross Salary: 2746 € per month

Informations générales

Thème/Domaine : Optimisation, apprentissage et méthodes statistiques
Système & réseaux (BAP E)

Ville : Sophia Antipolis

Centre Inria : Centre Inria d'Université Côte d'Azur

Date de prise de fonction souhaitée : 2023-09-01

Durée de contrat : 1 an, 4 mois

Date limite pour postuler : 2024-12-31

Attention: Les candidatures doivent être déposées en ligne sur le site Inria. Le traitement des candidatures adressées par d'autres canaux n'est pas garanti.

Consignes pour postuler

Sécurité défense :
Ce poste est susceptible d’être affecté dans une zone à régime restrictif (ZRR), telle que définie dans le décret n°2011-1425 relatif à la protection du potentiel scientifique et technique de la nation (PPST). L’autorisation d’accès à une zone est délivrée par le chef d’établissement, après avis ministériel favorable, tel que défini dans l’arrêté du 03 juillet 2012, relatif à la PPST. Un avis ministériel défavorable pour un poste affecté dans une ZRR aurait pour conséquence l’annulation du recrutement.

Politique de recrutement :
Dans le cadre de sa politique diversité, tous les postes Inria sont accessibles aux personnes en situation de handicap.

Contacts

Équipe Inria : NEO

Recruteur :
Neglia Giovanni / Giovanni.Neglia@inria.fr

A propos d'Inria

Inria est l’institut national de recherche dédié aux sciences et technologies du numérique. Il emploie 2600 personnes. Ses 215 équipes-projets agiles, en général communes avec des partenaires académiques, impliquent plus de 3900 scientifiques pour relever les défis du numérique, souvent à l’interface d’autres disciplines. L’institut fait appel à de nombreux talents dans plus d’une quarantaine de métiers différents. 900 personnels d’appui à la recherche et à l’innovation contribuent à faire émerger et grandir des projets scientifiques ou entrepreneuriaux qui impactent le monde. Inria travaille avec de nombreuses entreprises et a accompagné la création de plus de 200 start-up. L'institut s'eﬀorce ainsi de répondre aux enjeux de la transformation numérique de la science, de la société et de l'économie.

#J-18808-Ljbffr

Location: Antibes, FR

Posted Date: 11/26/2024

Click Here to Apply

View More Inria Jobs

Contact Information

Contact	Human Resources Inria