Job Description
- Req#: 25-15207
Role: AI Engineer
Location: Houston, TX 77007
Duration: Direct Hire
Work Authorization: US Citizen, Green Card Holders, or Authorized to Work in the US
We are seeking a talented AI Engineer with deep expertise in Large Language Model (LLM) engineering and design. The ideal candidate will be fluent in manipulating and integrating pre-trained LLMs within complex codebases to tackle practical challenges in data extraction, processing, and interactive systems. This role prioritizes hands-on application, such as customizing LLMs for specific tasks, including retrieval-augmented generation (RAG) pipelines and vision-based workflows, over building models from scratch. You'll focus on leveraging LLMs for solutions like advanced chatbots, natural language interfaces, semantic search, and creative problem-solving across data-intensive scenarios, while collaborating in a fast-paced team to push our AI products forward.
Key Responsibilities
- Integrate and fine-tune pre-trained LLMs into our codebase using APIs, frameworks, and orchestration tools to enable features like natural language querying, automated summarization, and intelligent anomaly detection in data streams.
- Design, build, and optimize chatbot systems and conversational AI, incorporating LLMs for seamless user experiences, including multi-turn dialogues, context-aware responses, and integration with external data sources.
- Implement RAG architectures to enhance LLM performance by combining retrieval from vector databases with generation, enabling accurate responses grounded in large-scale document corpora.
- Apply advanced LLM techniques, such as prompt engineering, chain-of-thought prompting, retrieval-augmented generation (RAG), and agentic workflows, to solve general problems like automating workflows, debugging data pipelines, or generating insights from unstructured inputs.
- Work with open-source LLMs (e.g., Gemma, Llama) for local deployment and inference, optimizing for on-premises or edge environments to ensure low-latency performance and data sovereignty.
- Incorporate computer vision tasks, such as OCR for text extraction from images or documents, and broader CV techniques for processing visual data in hybrid LLM pipelines.
- Experiment iteratively with LLM configurations, hyperparameters, and embeddings to boost performance metrics like accuracy, latency, and cost-efficiency in real-world scenarios, including vector search optimizations.
- Maintain scalable codebase integrations, conduct thorough testing (e.g., unit tests for LLM outputs, A/B evaluations), and ensure compliance with AI best practices, including bias mitigation and data security.
- Collaborate with cross-functional teams on code reviews, rapid prototyping, and knowledge sharing, while leveraging AI coding assistants to accelerate development.
Required Qualifications
- Bachelor's or Master's degree in Computer Science, Artificial Intelligence, Machine Learning, or a related field.
- 3+ years of professional experience in AI engineering, specifically manipulating and deploying LLMs in production (e.g., via Hugging Face Transformers, LangChain, LlamaIndex, or OpenAI/Groq APIs), including hands-on work with open-source models like Gemma and Llama for local deployment (e.g., using Ollama, vLLM, or direct PyTorch inference setups).
- Advanced proficiency in Python, including scripting for LLM pipelines, handling dependencies with tools like Poetry or Pipenv, and integrating with libraries such as Sentence Transformers for embeddings, FAISS for vector search, or Streamlit/Gradio for prototyping interfaces.
- Experience with vector databases and semantic search (e.g., Pinecone, Weaviate, or FAISS) to support efficient retrieval in LLM applications.
- Demonstrated expertise in RAG systems, from building retrieval components to integrating them with LLMs for enhanced reasoning and factuality.
- Proven track record building and optimizing chatbots or conversational agents (e.g., using Rasa, Dialogflow, or custom LLM-based setups), with examples of deploying them in user-facing applications.
- Strong general problem-solving abilities, demonstrated through projects involving breaking down ambiguous tasks into structured AI solutions, debugging LLM hallucinations, or optimizing for edge cases.
- Hands-on experience with AI-powered coding assistants (e.g., Cursor, GitHub Copilot, or similar tools) to enhance productivity in LLM engineering workflows.
- Experience with Google Cloud Platform (GCP) for deploying and managing AI workloads, including services like Vertex AI, Cloud Run, or AI Platform.
- Familiarity with vision tasks, including OCR (e.g., via Tesseract or EasyOCR) and computer vision libraries (e.g., OpenCV) for processing images, PDFs, or multimodal data in conjunction with LLMs.
- Excellent communication and teamwork skills for thriving in collaborative, iterative environments.
Preferred Qualifications
- Expertise in complementary LLM ecosystem tools, such as AutoGen for multi-agent systems, Haystack for RAG pipelines, or evaluation metrics like ROUGE/BLEU for assessing outputs.
- Familiarity with data-intensive applications, including processing unstructured text, integrating with databases (e.g., Pinecone, Weaviate), or handling multimodal inputs like images for OCR-enhanced workflows.
- Experience with other cloud platforms (e.g., AWS SageMaker, Azure ML) alongside GCP, including containerization with Docker and orchestration via Kubernetes.
- Contributions to open-source LLM projects, hackathons, or publications showcasing innovative uses of models like GPT-series, Llama, or Mistral.
- Knowledge of ethical AI practices, such as implementing guardrails for safe LLM interactions or conducting fairness audits.
What We Offer
- Competitive salary and benefits package, including health insurance, retirement plans, and professional development opportunities.
- A collaborative, innovative work environment in Houston with flexible hybrid options.
- The chance to work on impactful AI projects that directly influence industries.
- Opportunities for growth in a seed-stage company backed by strong funding and a visionary team.
About INSPYR Solutions
Technology is our focus and quality is our commitment. As a national expert in delivering flexible technology and talent solutions, we strategically align industry and technical expertise with our clients' business objectives and cultural needs. Our solutions are tailored to each client and include a wide variety of professional services, project, and talent solutions. By always striving for excellence and focusing on the human aspect of our business, we work seamlessly with our talent and clients to match the right solutions to the right opportunities. Learn more about us at inspyrsolutions.com.
INSPYR Solutions provides Equal Employment Opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, sex, national origin, age, disability, or genetics. In addition to federal law requirements, INSPYR Solutions complies with applicable state and local laws governing nondiscrimination in employment in every location in which the company has facilities.
Information collected and processed through your application with INSPYR Solutions (including any job applications you choose to submit) is subject to INSPYR Solutions’ Privacy Policy and INSPYR Solutions’ AI and Automated Employment Decision Tool Policy: https://www.inspyrsolutions.com/policies/. By submitting an application, you are consenting to being contacted by INSPYR Solutions through phone, email, or text.
About the company
Genuent provides an innovative approach to the delivery of information technology talent. Genuent is becoming INSPYR Solutions.
Notice
Talentify is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or protected veteran status.
Talentify provides reasonable accommodations to qualified applicants with disabilities, including disabled veterans. Request assistance at accessibility@talentify.io or 407-000-0000.
Federal law requires every new hire to complete Form I-9 and present proof of identity and U.S. work eligibility.
An Automated Employment Decision Tool (AEDT) will score your job-related skills and responses. Bias-audit & data-use details: www.talentify.io/bias-audit-report. NYC applicants may request an alternative process or accommodation at aedt@talentify.io or 407-000-0000.