Full Stack Developer (Java)

Full-Time Part-Time Remote

Develop and support scalable, extensible, and highly available data pipelines on heterogeneous datasets that power downstream applications and systems and serve content to our web and API products.
Closely collaborate with partners across product and design, engineering, and business teams to drive innovations that improve our customers’ experience.
Follow software-development best practices including test-driven development, contributing to documentation, feature flagging, etc.
Help to maintain and improve existing ETL pipelines.
Work with your team to troubleshoot and fix issues in ingest and processing, considering dependencies and integration points.
Collaborate with DevOps to plan resources and continuously optimize the infrastructure and configuration of our data pipelines to ensure healthy, high-performance production deployments.

Strong expertise in advanced data modeling, schema and ETL process design, implementation, and maintenance.
Experience with numerous data lake / warehouse technologies (Databricks, Snowflake, Presto, Dremio).
Strong knowledge of data lake best practices.
Strong background in advanced SQL, with a focus on understanding, manipulating, processing, and extracting value from large datasets and data streams.
Cloud experience (Azure or AWS preferred).
Advanced Python skills or Java skills.
Experience with a variety of data formats (JSON, Parquet, Excel, flat files).
Experience with databases and data storage frameworks, including Microsoft SQL Server, PostgreSQL, Elasticsearch, MongoDB, Cosmos DB, and Delta Lake.
Expertise in cloud messaging platforms such as Apache Kafka or Azure Service Bus.
Solid comprehension of common design patterns, algorithms, and data structures.
Working knowledge of containerization and modern cloud deployments including Docker and Kubernetes.
Bachelor’s degree in computer science or a related field.
Excellent communication and presentation skills.

Python Developer

Full-Time Part-Time Remote

Build and deploy production-grade AI/ML pipelines using Python, from data ingestion to model serving via REST APIs.
Develop LLM-powered applications and GenAI features: RAG pipelines, prompt engineering, tool-use agents and fine-tuning workflows.
Build scalable backend APIs using FastAPI or Django REST Framework, and deploy on AWS (EC2, Lambda, ECS).
Integrate with vector databases (Pinecone, Weaviate, ChromaDB) and LLM providers (OpenAI, Anthropic, Hugging Face).
Collaborate closely with the AI/ML team to bring models from research into reliable, maintainable production systems.

Full-Time Part-Time Remote

Develop, deploy, and support real-time, automated, scalable data streams from a variety of sources into the data lake or data warehouse.
Develop and implement data auditing strategies and processes to ensure data quality; identify and resolve problems associated with large scale data processing workflows; implement technical solutions to maintain data pipeline processes and troubleshoot failures.
Collaborate with technology teams and partners to specify data requirements and provide access to data.
Tune application and query performance using profiling tools and SQL or another relevant query language.
Understand business, operations, and analytics requirements for data.
Build data expertise and own data quality for your assigned areas.
Work with data infrastructure to triage issues and drive them to resolution.

Bachelor’s Degree in Data Science, Data Analytics, Information Management, Computer Science, Information Technology, related field, or equivalent professional experience.
7+ years of overall experience.
3+ years of experience working with SQL.
3+ years of experience implementing data warehouses based on modern data architectures.
2+ years of experience working with data warehouses such as Redshift, BigQuery, or Snowflake, and an understanding of data architecture design.
Excellent software engineering and scripting knowledge.
Strong communication skills (in both presentation and comprehension), along with an aptitude for thought leadership in data management and analytics.
Expertise with data systems working with massive data sets from various data sources.
Ability to lead a team of Data Engineers.

Experience working with time series databases.
Advanced knowledge of SQL, including the ability to write stored procedures, triggers, analytic/windowing functions, and tuning.
Advanced knowledge of Snowflake, including the ability to write and orchestrate streams and tasks.
Background in Big Data, non-relational databases, Machine Learning and Data Mining.
Experience with cloud-based technologies including SNS, SQS, SES, S3, Lambda, and Glue.
Experience with modern data platforms like Redshift, Cassandra, DynamoDB, Apache Airflow, Spark, or Elasticsearch.
Expertise in Data Quality and Data Governance.