Diverse Lynx

Azure GenAI Data Scientist

Employment type: Full-Time

  • Job Description

      Req#: 24-03389
      • Design, develop, deploy and improve production-grade, scalable, near-real-time machine learning, statistical predictive and NLP models built on near-real-time call transcripts and utterances
      • Develop Client algorithms and models using state-of-the-art techniques in NLP model architectures and algorithms such as BERT (and derivatives like BioBERT, RoBERTa, ALBERT etc.), BiLSTM, XLNet, T5, ELECTRA, PaLM and more
      • Partner with cross-functional teams, understand problems, and identify opportunities where advanced analytics and machine learning techniques can be used to make a significant impact and then design, develop, deploy and monitor those Client solutions


      • Capture and inform your Client infrastructure decisions using your understanding of Client modeling techniques and issues (including choice of model, data and feature selection, model training, hyperparameter tuning, dimensionality, bias/variance, and validation).
      • Design, implement, deploy, and maintain deep learning and Client models using cloud technologies (e.g., Azure Databricks, MLflow and Azure Client)
      • Write production-ready modeling code that can be scaled out to hundreds of millions of calls and millions of users


      • Promote deep scientific expertise, constant learning, attention to detail, and best practices while always being friendly, humble, and open to challenging any assumptions
      • Collaborate with data engineers, machine learning engineers, product managers and capability teams to coordinate timely deployments from conception to release
      • Promote and integrate best practices in data science and adhere to established work standards


      • Research new machine learning solutions to complex business problems
      • Communicate the process, requirements, assumptions and caveats of advanced Client and NLP concepts and deliverables in layman's terms to non-technical business leaders


      • BS, MS, or PhD in Computer Science, Statistics, Applied Mathematics, Data Science, Economics or related quantitative fields
      • 5+ years of experience designing, developing and deploying production-grade machine learning solutions in NLP (NLTK, Spark NLP, spaCy, HuggingFace, Flair, etc.) for real-world business problems
      • Hands-on experience with NLP model architectures and algorithms such as BERT (and derivatives like BioBERT, RoBERTa, ALBERT etc.), BiLSTM, XLNet, T5, ELECTRA, PaLM
      • Experience with LLMs, including open-source LLMs (such as ChatGPT, Llama, Falcon, Vicuna, Bard, etc.), and LangChain frameworks
      • Development and Client experience on the Microsoft Azure platform
      • Experience in deep learning neural networks (auto-encoders, feedforward networks, RNNs/CNNs, etc.)
      • Expertise in Python and SQL; working experience with Apache Spark, Hadoop, Databricks, Snowflake, or other big data systems is preferred
      • Combination of deep technical skills and business sense to interface with all levels and disciplines within an organization
      • Excellent written and verbal communication skills to explain complex research to both technical and non-technical audiences
      • Self-motivated individual who thrives in a dynamic environment

