AI/ML Engineer / Python Developer with LLM Experience - Charlotte, NC (Hybrid)

  • Hybrid
    • Charlotte, North Carolina, United States
  • $65 - $70 per hour

Job description

Charlotte, NC (Hybrid – 3 days onsite / 2 days remote)
Local W2 candidates only

We are seeking a highly skilled and motivated AI/ML Engineer / Python Developer with proven experience in Large Language Models (LLMs), GPU-based computing, and cloud-native architecture on GCP. The ideal candidate will possess a strong background in API development, distributed computing, and deploying AI/ML solutions at scale using modern tools and frameworks.

  • Design, develop, and deploy AI/ML models with a focus on generative AI, using model families such as LLaMA and Mistral.

  • Optimize model training and inference on GPU clusters, using multi-GPU training techniques.

  • Develop and expose RESTful APIs using FastAPI, Uvicorn, and Swagger for AI model integration.

  • Architect and maintain cloud-native applications on Google Cloud Platform (GCP), including use of TPUs and GPU instances.

  • Build and scale data pipelines with Apache Kafka for real-time data streaming and use Apache Spark (PySpark) for distributed data processing.

  • Implement and support Kubernetes-based infrastructure for scalable model training and deployment.

  • Develop back-end services and APIs using Python and Django.

  • Configure and manage NVIDIA GPU drivers and environments for high-performance computing.

  • Work collaboratively in a fast-paced, hybrid work environment and contribute to continuous integration and deployment practices.

Job requirements

  • 7–10 years of total experience in software engineering and AI/ML.

    1. Hands-on experience with Python, FastAPI, Uvicorn, Swagger, and Django.

    2. Strong understanding of generative AI model families (e.g., LLaMA, Mistral).

    3. Deep expertise in GPU-accelerated training, with proficiency in TensorFlow Distributed, PyTorch Distributed, and Horovod.

    4. Proficiency in Apache Kafka, Apache Spark (PySpark), and Kubernetes.

    5. Demonstrated experience with GCP services, particularly with TPUs and GPU-enabled compute instances.

    6. Experience in building and deploying scalable cloud-native architectures and microservices.

    7. Excellent problem-solving, communication, and teamwork skills.

    • Contributions to open-source LLM projects or experience training LLMs from scratch.

    • Familiarity with MLOps practices and frameworks.

TopTech Talent is proud to be an equal opportunity workplace and is an affirmative action employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, age, national origin, citizenship status, disability, protected veteran status, gender identity, or any other factor protected by applicable federal, state, or local laws.

    🚫 Third-party recruiters, please do not reach out for this role.
