About
Cyndx is an Artificial Intelligence and Natural Language Processing (NLP) platform that offers 'search and discovery' solutions for entrepreneurs, start-ups, investors, and acquirers. Our subscription-based solution helps enhance capital raising, acquisitions, and other business opportunities. Our platform hosts data on over 20 million companies worldwide and is used by some of the largest financial institutions in the world.
As a Data Scientist at Cyndx, you will join our AI team to develop and maintain machine learning models that power our financial intelligence platform. You'll work across our technology stack, from maintaining our 'Projected To Raise' forecasting model to implementing new features in our FastAPI middleware service. This role offers an exceptional opportunity for early-career data scientists to gain hands-on experience in applying cutting-edge AI techniques to solve real-world financial problems, while working with senior developers to enhance your skills in machine learning operations, data engineering, and software development.
This role will be located in either our New York or our West Palm Beach office. Please note that we currently operate on a hybrid model, with four days in the office and one day remote each week. Fully remote work is not available for this role.
Responsibilities
- Develop, maintain, and enhance our models using time series analysis, machine learning, and deep learning techniques.
- Implement and fine-tune open-source LLMs for specialized financial text generation, summarization, and classification tasks.
- Build, deploy, and monitor agentic AI workflows that interact with our financial data ecosystem.
- Design and develop data engineering pipelines to ingest, transform, and integrate new financial data sources.
- Create and maintain features in our FastAPI middleware service, including developing new endpoints and optimizing existing ones.
- Collaborate with cross-functional teams to identify opportunities for AI/ML solutions to improve our product offerings.
- Write clean, maintainable, and well-documented code following team standards and best practices.
- Participate in code reviews, testing, and deployment processes using modern CI/CD practices.
- Assist in troubleshooting production issues and performance optimization.
Qualifications
- 1-2 years of professional experience in data science, machine learning, or a related technical field.
- Bachelor's or Master's degree in Computer Science, Statistics, Mathematics, Engineering, or a related STEM field.
- Strong foundation in statistics, probability, calculus, and linear algebra, particularly as applied to time-dependent data.
- Experience with Python programming and data science libraries (NumPy, Pandas, scikit-learn, PyTorch or TensorFlow).
- Knowledge of NLP concepts and techniques, including word embeddings, transformer architectures, and large language models.
- Proficiency in SQL and experience working with relational databases.
- Familiarity with cloud services, preferably GCP (BigQuery, Cloud Run, GCS) or similar platforms.
- Understanding of version control systems (Git) and collaborative development workflows.
- Basic knowledge of RESTful API design and web frameworks like FastAPI.
- Excellent problem-solving skills and attention to detail.
- Strong communication skills and ability to explain complex technical concepts to both technical and non-technical stakeholders.
- Self-motivated with a desire to learn and adapt to new technologies and methodologies.
Preferred Qualifications (Nice to Have)
- Experience with financial data, financial modeling, or quantitative analysis.
- Familiarity with time series forecasting techniques and competitions like VN1.
- Experience with infrastructure-as-code tools like Terraform.
- Knowledge of Docker containerization and Kubernetes orchestration.
- Prior work with agentic AI systems or RAG (Retrieval-Augmented Generation) architectures.
- Understanding of MLOps practices and tools.
- Experience contributing to open-source projects or building reusable software components.