Site Reliability Engineering Jobs

102 jobs found

full time
onsite/hybrid in new-york united-states

## About Us

Ether.fi is one of the largest crypto companies in the world. We have over $10B in assets under management and a team of 30, almost all technical staff. We’ve been profitable from day one and are focusing on building real consumer applications. The future we envisage is true on-chain banking. Our current products (Stake, Liquid, and Cash) enable users to earn yield in a variety of DeFi strategies and spend their assets in the real world with our credit cards. You can view public analytics on Dune for Cash and Stake.

## Position Overview

We’re looking for a senior backend software engineer to own core backend architecture, card transaction processing, and high availability. You will belong to a key engineering team focusing on delivering distributed, scalable, and robust backend systems. This includes everything from card transaction processing and low-latency APIs to microservice architectures and on-chain data indexing. Anything that a neobank would build is on our roadmap.

## What You’ll Do

* Own core areas of our backend systems and extend current functionality to improve throughput and latency.
* Design, develop, and optimize scalable transaction processing services to support high-throughput, low-latency financial transactions.
* Tackle the user friction caused by declined card transactions.
* Drive improvements in our Cloud Infrastructure, CI/CD pipelines, automation, observability, and monitoring systems.

## What We Are Looking For

* Strong fundamentals in software engineering and problem solving.
* You have shipped impressive products with thousands of real users.
* 5+ years of professional coding experience.
* Drive and a willingness to work hard and go the extra mile. We are a highly ambitious, driven team with audacious goals, and we work very hard to make them a reality.
* **Plus:** Experience with credit card transaction processors such as Paymentology, Episode 6, etc.
* **Plus:** Experience on the card issuing side.

## Perks and Benefits

* Competitive salary, performance-based incentives, and token allocation grant.
* Opportunity to work on multi-billion dollar products as part of a small team.
* Health, dental, and vision insurance plans.
* Global team with opportunity for travel and working out of our 3 offices around the world.
* 4 weeks work from anywhere.
* Exciting company events and team-building off-sites.

*When applying, mention the word CANDYSHOP to show you read the job post completely.*

defi, crypto, web3, +5 more
full time
onsite/hybrid in san-jose united-states

## Job Title: Executive Vice President Operations

## Department

Executive Office / Global Operations & Execution Strategy

## Reporting to

Chief Executive Officer / Chief Operating Officer

## Location

San Francisco, California / Hybrid Office

## Company Introduction

**TradeForce AI** is a blockchain technology-driven company focused on building intelligent, scalable infrastructure for digital asset markets, decentralized finance (DeFi), and next-generation trading ecosystems. Headquartered in San Francisco, TradeForce AI operates at the convergence of artificial intelligence, blockchain, and financial market innovation, delivering advanced solutions for trading optimization, liquidity intelligence, and automated market execution.

By integrating machine learning models, real-time data analytics, and decentralized technologies, TradeForce AI empowers institutions, trading firms, and digital asset platforms to enhance execution efficiency, optimize liquidity strategies, and navigate complex, high-velocity market environments.

TradeForce AI partners with crypto exchanges, liquidity providers, institutional investors, and Web3 ecosystems to deliver technology-enabled solutions that improve transparency, performance, and scalability across global digital markets.

As part of its strategic growth, TradeForce AI is establishing San Francisco as its global operations, execution, and infrastructure hub, driving operational excellence, platform scalability, and cross-border market execution capabilities.

TradeForce AI envisions a future where intelligent automation, decentralized infrastructure, and financial markets converge to create highly efficient, autonomous, and resilient global trading systems.

## Job Overview

The **Executive Vice President, Operations** will serve as a senior executive leader responsible for defining, building, and executing TradeForce AI’s global operational strategy, infrastructure, and execution framework.

Reporting directly to the CEO and COO, this role will oversee all aspects of operations (including platform operations, trading infrastructure, business operations, process optimization, and cross-functional execution), ensuring alignment with the company’s blockchain-driven technology model and rapid growth objectives.

The EVP of Operations will play a critical role in scaling TradeForce AI’s operational capabilities, ensuring system reliability, execution efficiency, and organizational alignment across global teams and markets.

The ideal candidate brings deep experience in fintech, crypto, trading infrastructure, or high-growth technology environments, with a proven ability to build scalable operations, manage complex systems, and drive execution excellence in fast-moving, data-intensive ecosystems.

## Key Responsibilities

### Operational Strategy & Execution Leadership

- Define and lead TradeForce AI’s global operations strategy aligned with corporate vision and growth priorities.
- Build scalable operational frameworks supporting trading infrastructure, platform performance, and client delivery.
- Ensure seamless execution across product, engineering, trading, and commercial functions.
- Drive operational excellence in high-frequency, real-time market environments.

### Platform Operations & Infrastructure Management

- Oversee the performance, reliability, and scalability of TradeForce AI’s trading and blockchain infrastructure.
- Establish operational standards for system uptime, latency optimization, and execution efficiency.
- Collaborate with engineering and data teams to ensure robust architecture and continuous system improvement.
- Manage operational risk related to platform performance, cybersecurity, and infrastructure resilience.

### Business Operations & Process Optimization

- Design and implement efficient business processes across trading, client onboarding, and operational workflows.
- Optimize end-to-end operational pipelines to support growth, scalability, and cost efficiency.
- Establish standardized operating procedures (SOPs) and execution frameworks across functions.
- Drive automation initiatives to enhance operational speed and reduce manual dependencies.

### Growth Enablement & Global Expansion

- Support TradeForce AI’s expansion into new markets through scalable operational infrastructure.
- Collaborate with commercial and partnership teams to enable efficient client integration and onboarding.
- Adapt operational frameworks to align with regional regulatory, technological, and market requirements.
- Enable cross-border trading operations and global execution capabilities.

### Data, Performance & Operational Analytics

- Implement KPIs, dashboards, and real-time monitoring systems to track operational performance.
- Leverage data analytics to identify inefficiencies, optimize execution, and improve decision-making.
- Drive continuous improvement through performance measurement and operational insights.
- Align operational metrics with business outcomes, revenue growth, and platform performance.

### Governance, Risk & Compliance Alignment

- Ensure operational processes align with regulatory requirements, internal controls, and risk frameworks.
- Collaborate with Legal, Compliance, and Risk teams to manage operational and market risks.
- Establish controls around trading operations, data integrity, and system security.
- Support audit readiness and governance best practices in a rapidly evolving regulatory landscape.

### Leadership & Organizational Development

- Build, lead, and mentor a high-performing global operations organization.
- Foster cross-functional collaboration across engineering, trading, product, and commercial teams.
- Develop operational leadership pipelines and succession planning strategies.
- Promote a culture of accountability, speed, precision, and execution excellence.

## Qualifications

- 20+ years of progressive leadership experience in operations, infrastructure, or execution roles within fintech, crypto, capital markets, or high-growth technology companies.
- Proven success in building and scaling operational frameworks in complex, data-driven environments.
- Strong understanding of trading systems, blockchain infrastructure, digital assets, or financial market operations.
- Experience managing cross-functional teams and working closely with executive leadership.
- Demonstrated ability to operate in high-speed, high-stakes environments requiring precision and reliability.
- Bachelor’s degree in Engineering, Computer Science, Finance, or related field required; MBA or advanced degree preferred.

## Core Competencies

- **Operational Excellence:** Builds scalable, high-performance operational systems.
- **Execution Discipline:** Delivers consistent, reliable execution in complex environments.
- **Infrastructure & Systems Thinking:** Understands the interplay between technology, data, and operations.
- **Data-Driven Optimization:** Uses analytics to enhance efficiency and performance.
- **Risk & Control Awareness:** Balances innovation with operational stability and compliance.
- **Leadership & Alignment:** Drives cross-functional execution and organizational cohesion.

## Benefits

- Competitive executive compensation with performance-based incentives.
- Eligibility for equity participation and long-term incentive programs.
- Comprehensive health, insurance, and retirement benefits.
- Hybrid work model with San Francisco headquarters.
- Opportunities to represent TradeForce AI at global blockchain, crypto, and fintech conferences.

## Expected Outcomes (12–18 Months)

Establish a scalable, resilient global operations infrastructure supporting TradeForce AI’s growth.

blockchain, crypto, defi, +4 more
full time
onsite/hybrid in united-states

## About Mysten Labs

Mysten Labs believes that decentralized and open protocols are the bedrock of the internet of value. This is why at Mysten Labs, we are creating foundational infrastructure to accelerate the adoption of decentralized protocols based on blockchain technologies.

## Position: Sr Software Engineer, Interoperability

**Job Site:** Palo Alto, CA and Remote

## Responsibilities

* Design and develop software solutions, supporting a broader set of RPC and data-heavy services built on top of Sui.
* Utilize hands-on software engineering, including methodologies to ensure correctness and performance of implemented software components.
* Be fluent in systems languages such as Rust and C++.
* Analyze information and utilize experience in evaluating secure software systems, including secure programming, appropriate use of cryptography, and prevention of DoS attacks.
* Handle horizontal software systems in the Interop team that interface directly with the Sui network, such as the Sui RPC nodes, and support verticals that span teams including Cryptography, Move Platform, and Walrus by productionizing their externally facing services.
* Participate in the Interop Layer, providing software systems support for the services that run on top of Sui, in all aspects of their Sui interactions, state management, production reliability, and release management.
* Design and implement large distributed systems.
* Utilize experience in networking, distributed systems, storage, databases, operating systems, and runtimes to be able to read research papers in the field and design systems based on them.
* Be fluent in writing design documentation as well as participating in technical discussions and reviews synchronously, asynchronously, in person or remotely.

## Minimum Requirements

* Master’s degree in Computer Science, Software Development, or related discipline, or foreign equivalent, and 3 years of experience in the job offered or in a closely related position.

## Special Requirements

Additionally, the position requires any amount of experience, gained through employment or coursework in a degree program, in each of the following:

1. C++
2. Writing design documentation
3. Networking, distributed systems, storage, databases, operating systems, and runtimes
4. Design and evaluation of secure systems, including secure programming, cryptography, and prevention of DoS attacks

## Additional Information

* Background checks may be required.
* Remote work acceptable.
* Proof of authorization to work in U.S. is required if hired.
* The company is an Equal Opportunity Employer and fully supports affirmative action practices.

*Employment is contingent upon the successful completion of a background check, which may include verification of employment history, education credentials, criminal history, and other relevant information.*

*Regarding the recent rash of technology job scams: Be aware that emails from genuine Mysten Labs group recruiters will always come from the @mystenlabs.com domain or related subdomains (e.g., mystenlabs.com/careers). Remember: you can always verify positions on our job boards at www.mystenlabs.com/careers.*

*To support an efficient and fair hiring process, we may use technology-assisted tools, including artificial intelligence (AI), to help identify and evaluate candidates. All hiring decisions are ultimately made by human reviewers.*

*Our team is remote first and we are hiring across the world. Here at Mysten Labs, you’ll be joining a world-class team with tremendous growth potential as we bring the next billion users to web3. We raised a $300M Series B round from top Silicon Valley led venture funds like Jump Crypto, Andreessen Horowitz (a16z), Binance Labs, Redpoint, Lightspeed, Coinbase Ventures, Electric Capital, Standard Crypto, NFX, Slow Ventures, Scribble Ventures, Samsung Next, Lux Capital, among other investment firms and strategic partners. Come join us and build the future of web3!*

**Contact:** Apply at https://www.jobpostingtoday.com/ Ref# 73447.

rust, blockchain, web3, +2 more

15 days ago

full time
onsite/hybrid in new-york united-states

## About Polymarket

**Polymarket** is the world’s largest prediction market platform. We enable individuals to express views on real-world events by trading on outcomes across politics, economics, sports, culture, and current affairs. Built as a peer-to-peer marketplace with no centralized “house,” Polymarket aggregates diverse opinions into transparent, market-based probabilities that reflect collective expectations about the future.

We’re growing fast – both in terms of volume ($21B traded in 2025) and adoption as an alternative news source. Our ambition is to become a ubiquitous beacon of truth in global media and we need your help adding fuel to the fire.

## About the role

Polymarket is looking for an **AI-native Mobile Engineer** to build the automation layer that powers our mobile development team. This is not a traditional mobile product engineering role – you’ll design and implement AI-driven systems that make our mobile engineers dramatically more efficient.

You’ll work at the intersection of mobile development and AI tooling, building orchestration systems that automate everything from ticket creation to code generation, testing, and review. The goal is to push toward a fully autonomous mobile development pipeline.

This is a high-ownership role for someone who is deeply curious about the frontier of AI-assisted engineering and wants to rethink how mobile software gets built.

## What you’ll do

* **Build AI-powered development workflows.** You’ll design and implement systems that automate the mobile development lifecycle, from issue creation to pull requests and code review.
* **Own mobile-specific AI tooling.** You’ll tackle challenges unique to mobile environments (e.g., compilation constraints, CI on macOS, simulator workflows) and build systems that integrate seamlessly with them.
* **Improve the feedback loop.** You’ll optimize and re-architect our current tooling stack (e.g., Slack bots, ticketing systems, codegen tools, CI pipelines) to reduce latency and increase reliability.
* **Develop orchestration systems.** You’ll connect tools like LLMs, design systems, and codebases into cohesive pipelines that can reason about and modify mobile codebases.
* **Collaborate with mobile engineers.** You’ll work closely with the iOS team to identify bottlenecks and build tools that meaningfully improve developer velocity.
* **Experiment aggressively.** You’ll stay on the cutting edge of AI tooling and continuously evaluate new models, frameworks, and approaches.
* **Contribute to mobile code when needed.** During high-priority moments, you may occasionally jump into the codebase to support product development.

## What we’re looking for

* 2+ years of experience building native iOS (and ideally Android) applications
* Proficiency in Swift/Kotlin, with a solid understanding of mobile build systems and tooling
* Strong programming skills in TypeScript (or willingness to ramp quickly)
* Experience building or experimenting with AI-powered developer tools, agents, or automation systems
* Familiarity with LLMs and modern AI tooling (e.g., code generation, agents, orchestration frameworks)
* A systems mindset: you think in terms of pipelines, automation, and leverage
* High agency and curiosity: you proactively explore new tools, ideas, and approaches
* *(Plus)* Experience building internal developer tools or CI/CD systems
* *(Plus)* Familiarity with mobile CI environments (e.g., macOS build systems, simulators, provisioning)
* *(Plus)* Experience with tools like Cursor, Claude, or similar AI coding assistants
* *(Plus)* Active engagement with the AI engineering ecosystem (Twitter, blogs, open-source, etc.)
* *(Plus)* Experience integrating with tools like Slack, Linear, or Figma APIs

## Benefits

* Competitive salary & equity
* Unlimited PTO, Health, Vision, & Dental coverage
* 401k match
* Hardware setup: new MacBook Pro, big display, & accessories

machine-learning, typescript, cicd, +2 more
full time
onsite/hybrid in london united-kingdom

## Building the Future of Crypto

Our Krakenites are a world-class team with crypto conviction, united by our desire to discover and unlock the potential of crypto and blockchain technology.

**What makes us different?**

**Kraken** is a mission-focused company rooted in crypto values. As a Krakenite, you’ll join us on our mission to accelerate the global adoption of crypto, so that everyone can achieve financial freedom and inclusion. For over a decade, Kraken’s focus on our mission and crypto ethos has attracted many of the most talented crypto experts in the world.

Before you apply, please read the Kraken Culture page to learn more about our internal culture, values, and mission. We also expect candidates to familiarize themselves with the Kraken app. Learn how to create a Kraken account here.

As a fully remote company, we have Krakenites in 70+ countries who speak over 50 languages. Krakenites are industry pioneers who develop premium crypto products for experienced traders, institutions, and newcomers to the space. Kraken is committed to industry-leading security, crypto education, and world-class client support through our products like Kraken Pro, Desktop, Wallet, and Kraken Futures.

**Become a Krakenite and build the future of crypto!**

## Proof of work

## The team

The Platform Team exists to accelerate engineering across Kraken by building and evolving the foundational infrastructure that powers development at scale. Its mission is to remove friction, solve systemic platform challenges, and ensure teams can ship quickly, reliably, and efficiently.

The team defines and enforces engineering standards across APIs and services, improving consistency, performance, and scalability across the stack. It drives platform-wide architecture improvements, strengthens service interfaces, and ensures long-term maintainability as the system evolves.

A core focus is performance and scale: proactively identifying bottlenecks, improving observability and tracing, and ensuring the platform stays ahead of production demands. The team establishes strong testing and performance baselines, enabling teams to automate stress testing and build confidently on a resilient foundation.

Operating cross-functionally across engineering, the Platform Team works at every layer of the stack, from low-level networking to service architecture, strengthening the core systems that enable the entire organization to execute faster and at higher quality.

## The opportunity

- Collaborate closely with Product, DevOps, SRE, and Security teams to ensure Kraken's platform is reliable, secure, and scalable
- Mentor senior and mid-level engineers, influencing best practices in system design, testing, and performance optimization
- Take ownership of system-wide architectural initiatives and shape long-term technical strategy
- Contribute to the team’s incident response, root cause analysis, and system hardening efforts
- Champion continuous improvement by introducing new tools, techniques, and technologies that raise the bar for backend engineering

## Skills you should HODL

- 10+ years of experience designing and implementing high-performance backend systems, preferably in finance, trading, or distributed environments
- Deep expertise in at least one systems language (C++, Go, or Rust) and solid scripting knowledge in Python
- Strong familiarity with Linux systems, including low-level debugging, concurrency, and profiling
- Experience building and optimizing low-latency, high-throughput services handling large-scale transaction volumes
- Demonstrated ability to make data-driven architectural decisions and communicate trade-offs clearly
- Experience with distributed systems, messaging queues, and inter-service communication protocols (gRPC, REST, etc.)
- Familiarity with cryptocurrency markets, DeFi, and blockchain protocols is a major plus
- BS/MS in Computer Science, Engineering, or a related quantitative discipline

*Unless a specific application deadline is stated in the job posting, applications are accepted on an ongoing basis.*

*Please note, applicants are permitted to redact or remove information on their resume that identifies age, date of birth, or dates of attendance at or graduation from an educational institution.*

*We consider qualified applicants with criminal histories for employment on our team, assessing candidates in a manner consistent with the requirements of the San Francisco Fair Chance Ordinance.*

Kraken is powered by people from around the world and we celebrate all Krakenites for their diverse talents, backgrounds, contributions and unique perspectives. We hire strictly based on merit, meaning we seek out the candidates with the right abilities, knowledge, and skills considered the most suitable for the job. We encourage you to apply for roles where you don't fully meet the listed requirements, especially if you're passionate or knowledgeable about crypto! We may ask candidates to complete job-related skills or work-related tasks.

As an equal opportunity employer, we don’t tolerate discrimination or harassment of any kind, whether that’s based on race, ethnicity, age, gender identity, citizenship, religion, sexual orientation, disability, pregnancy, veteran status or any other protected characteristic as outlined by federal, state or local laws.

rust, crypto, blockchain, +5 more
full time
onsite/hybrid in georgia

# Director of AI Data Center Design

**Location:** Georgia
**Department:** AI – AI Data Centers
**Commitment:** Full Time
**Workplace Type:** On-site

**About CleanSpark**

CleanSpark (Nasdaq: CLSK) is a market-leading data center developer with a proven track record of success. We control a portfolio of more than 1.8 GW of power, land, and data centers across the United States powered by globally competitive energy prices. Sitting at the intersection of Bitcoin, energy, operational excellence, and capital stewardship, we optimize our infrastructure to deliver superior returns to our shareholders. Monetizing low-cost, high reliability energy by producing a global emerging critical resource – compute – positions us to prosper in an ever-changing world. Visit our website at **www.cleanspark.com**.

## Job Overview

The Director of AI Data Center Design will lead the execution of CleanSpark’s AI data center design programs, ensuring delivery of scalable, standardized, and construction-ready infrastructure solutions. This role is responsible for translating established Basis of Design (BoD) standards into fully coordinated engineering designs across electrical, cooling, controls, and network systems.

The Director of AI Data Center Design will oversee design execution across multiple programs or regions, ensuring alignment with operational requirements, constructability, and deployment timelines. This role plays a critical part in advancing CleanSpark’s modular, repeatable infrastructure strategy by driving consistency, quality, and efficiency across all design deliverables.

The Director will partner cross-functionally with Product, Power, Construction, and Operations teams to ensure seamless design integration and successful handoff for execution. This role requires strong technical leadership, systems thinking, and the ability to operate effectively in a fast-paced, high-growth environment.

## Job Duties

### Design Leadership & Execution

- Lead the development and delivery of coordinated, construction-ready design packages across assigned programs, regions, or system scopes.
- Translate Basis of Design (BoD) requirements into detailed engineering designs that are scalable, repeatable, and aligned with CleanSpark standards.
- Ensure design outputs meet quality, schedule, and performance expectations.

### Multi-Disciplinary Systems Integration

- Oversee integration across electrical, cooling, controls, and network systems to ensure cohesive and optimized infrastructure design.
- Ensure designs are fully coordinated across disciplines, addressing constructability, efficiency, and operational performance.

### Standardization & Modular Design (DfMA)

- Apply Design for Manufacturing and Assembly (DfMA) principles to support modular, repeatable infrastructure deployment.
- Drive consistency and adherence to standardized design frameworks, minimizing unnecessary customization.

### BIM & Design Coordination

- Utilize Building Information Modeling (BIM) and other digital tools to manage design coordination and clash detection.
- Lead design review meetings with internal and external stakeholders to resolve conflicts and ensure alignment.

machine-learning, blockchain, cloud, +3 more
full time
onsite/hybrid in georgia

# Vice President of AI Data Center Design

**Location:** Georgia
**Department:** AI – AI Data Centers
**Commitment:** Full Time
**Workplace Type:** On-site

---

**CleanSpark** (Nasdaq: CLSK) is a market-leading data center developer with a proven track record of success. We control a portfolio of more than 1.8 GW of power, land, and data centers across the United States powered by globally competitive energy prices. Sitting at the intersection of Bitcoin, energy, operational excellence, and capital stewardship, we optimize our infrastructure to deliver superior returns to our shareholders. Monetizing low-cost, high reliability energy by producing a global emerging critical resource – compute – positions us to prosper in an ever-changing world. Visit our website at **www.cleanspark.com**.

---

## Job Overview

The **Vice President of AI Data Center Design** will lead CleanSpark’s end-to-end data center design function, establishing the strategy, standards, and execution model required to scale AI and high-performance computing infrastructure across the organization. This role is responsible for defining and operationalizing CleanSpark’s design engineering vision, ensuring all infrastructure designs are scalable, standardized, cost-effective, and aligned with long-term business objectives.

The Vice President of AI Data Center Design will oversee multi-disciplinary design engineering teams and drive the development of integrated design systems across electrical, cooling, controls, and network infrastructure. This role will ensure that Basis of Design (BoD) frameworks, modular design principles, and Design for Manufacturing and Assembly (DfMA) standards are consistently applied to enable rapid, repeatable deployment of infrastructure at scale.

As a key member of the leadership team, the VP will partner closely with Product, Power, Construction, Operations, and Finance to align design strategy with capital planning, operational performance, and growth initiatives. This role will play a critical part in enabling CleanSpark’s continued expansion by ensuring design excellence, speed to deployment, and long-term infrastructure reliability.

---

## Job Duties

### Design Strategy & Organizational Leadership

- Define and lead the strategic vision for design engineering across CleanSpark’s data center infrastructure portfolio.
- Establish and evolve design standards, Basis of Design (BoD), and engineering frameworks to support scalable growth.
- Build and lead a high-performing, multi-disciplinary design engineering organization, including hiring, development, and succession planning.

### Standardization & Scalable Infrastructure

- Drive adoption of modular design principles and Design for Manufacturing and Assembly (DfMA) across all infrastructure programs.
- Ensure consistent application of standardized design systems to enable repeatable, cost-effective deployment.
- Balance innovation with standardization to optimize performance, speed, and cost.

### Multi-Disciplinary Systems Integration

- Oversee integration across electrical, cooling, controls, and network systems to ensure cohesive and optimized infrastructure performance.
- Ensure design solutions are aligned with operational requirements, reliability standards, and long-term scalability.

### Design Execution Oversight

- Provide executive oversight of design execution across all programs, regions, and projects.
- Ensure delivery of high-quality, coordinated, and construction-ready design packages aligned with timelines and business priorities.
- Establish governance, processes, and metrics to drive accountability and performance across the design function.

### Cross-Functional Alignment & Business Integration

- Partner with Product, Construction, Operations, Power, and Finance to align design strategy with capital planning, project delivery, and operational performance.
- Ensure seamless transition from design to construction and operations, minimizing risk and maximizing efficiency.

### Innovation & Continuous Improvement

- Identify and implement opportunities to improve design efficiency, reduce costs, and enhance infrastructure performance.
- Drive adoption of new technologies, tools, and methodologies to support CleanSpark’s evolving infrastructure needs.

### Risk, Compliance & Governance

- Ensure all designs comply with applicable codes, regulations, and internal standards.
- Establish and maintain design governance processes to manage risk, ensure quality, and support audit readiness.

### Executive Leadership & Stakeholder Engagement

- Serve as a key advisor to executive leadership on infrastructure design strategy and investment decisions.
- Communicate design strategy, performance, and risks to senior stakeholders and executive leadership.

### Additional Responsibilities

- Support strategic initiatives related to expansion, acquisitions, and new site development.
- Lead special projects and additional responsibilities as assigned.

*This role requires executive presence, strategic thinking, and the ability to lead complex, high-impact initiatives in a fast-paced, growth-oriented environment.*

---

## Qualifications

- This role requires up to 50% travel.
- Candidates must be based within one hour of a major U.S. airport to support travel requirements.
- Bachelor’s degree in Engineering, Architecture, or related technical field (master’s degree preferred).
- 12–15+ years of progressive experience in engineering design, with significant leadership experience in data centers, mission-critical infrastructure, or industrial environments.
- Proven track record of leading large-scale, multi-disciplinary design engineering organizations.
- Deep expertise in MEP systems and integrated infrastructure design.
- Experience driving design standardization, modular infrastructure, and scalable deployment models.
- Demonstrated ability to influence cross-functional stakeholders and align technical strategy with business objectives.
- Strong leadership and communication skills.

machine-learning, cloud, site-reliability-engineering
full time
remote in toronto canada

## DevOps Engineer - Canada Wide - Remote

**Location:** Toronto, Ontario
**Department:** Engineering
**Workplace Type:** Remote

Say hello to Newton! We're changing how Canadians trade crypto. Our goal? To make financial freedom something everyone can achieve. We give our customers the tools and knowledge they need to navigate the crypto world.

At Newton, you'll work with a remote team spread across Canada, but you'll never feel distant. Ready to be part of something meaningful? Join a team that’s all about pushing boundaries and getting things done.

**Some of our values:**

- **Customer-first mindset** - Commitment to integrity and transparency to our users!
- **A dynamic team fueled by collaboration**, uniting our strengths to overcome any obstacles. Together we build success. We persevere, adapt, and come back stronger, turning obstacles into opportunities.
- **We strive for continuous improvement**, embrace creativity, and encourage experimentation. We push the boundaries of what’s possible and continuously explore new ideas, technologies, and solutions.

### Role Overview

We are searching for a DevOps Engineer to improve how we build, deploy, and run our systems. This role works across infrastructure, CI/CD, observability, and operational tooling in an AWS-based environment spanning backend, frontend, and internal services.

### Responsibilities will include:

- Improve and maintain CI/CD, deployment workflows, and environment management across backend, web, and internal services
- Build, maintain, and scale infrastructure across AWS and container-based services
- Improve monitoring, alerting, logging, dashboards, tracing, and runbooks
- Work with engineers on safer deploys, rollback plans, and recovery from failures
- Automate repetitive operational work and improve internal tooling
- Maintain and improve infrastructure as code and deployment tooling
- Help improve failover planning, recovery procedures, and backup/restore testing for critical systems
- Support production systems and take part in on-call for critical services
- Manage and scale infrastructure across AWS, ECS, Docker, PostgreSQL, Redis, Celery, and Go/Python-based services
- Lead incident response and postmortems, and drive follow-up actions to reduce repeat issues
- Improve reliability, resilience, and operational readiness across critical systems

### Who you are:

- Experience running production systems in AWS or a similar cloud environment
- Experience with CI/CD and infrastructure automation
- Strong understanding of AWS networking, including VPCs, subnets, route tables, security groups, load balancers, DNS, and connectivity between services
- Comfort with Linux, shell scripting, Python, and Go
- Experience with Docker and ECS or Kubernetes
- Experience with GitHub Actions, Pulumi, Terraform, or similar tooling
- Experience with Datadog, Prometheus, Grafana, or similar observability tools
- Good understanding of PostgreSQL, Redis, queues, async workers, and scheduled jobs
- Familiarity with Cloudflare or similar edge, networking, or traffic management tooling
- A practical approach to automation, reliability, and day-to-day operational work
- Experience with on-call and incident response for business-critical systems
- Strong troubleshooting skills across application, infrastructure, and data layers

At Newton, we celebrate our inclusive work environment and welcome members of all backgrounds and perspectives to apply. We are committed to providing reasonable accommodations and will work with you to meet your needs. If you are a person with a disability and require assistance during the application process, please don’t hesitate to reach out!

*We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.*

**Apply for this job**

When applying, mention the word **CANDYSHOP** to show you read the job post completely.

aws, docker, kubernetes (+6 more)
full time
onsite/hybrid in tel-aviv israel

**Chainalysis** is seeking a technically strong, customer-focused **Senior Technical Solutions Engineer** to serve as the escalation point for complex product and integration challenges across the Chainalysis platform. The role is focused on API troubleshooting, data operations, deployment support, and customer partnership, with additional responsibility for On-Premises deployment and maintenance workflows.

This individual will work directly with customers and internal teams to diagnose and resolve technical issues, guide clients through connecting their products to ours via our APIs, and execute data operations that keep our customers running smoothly. For our On-Premises customers, when needs arise, they will step in to deploy, maintain, and troubleshoot customer environments with confidence.

Success in this role requires deep technical capability, an understanding of blockchain and crypto compliance workflows, and an exceptional ability to communicate clearly with both technical and non-technical audiences.

## In this role, you'll:

- Serve as the escalation point for complex technical issues across the Chainalysis product suite, including **KYT (Know Your Transaction)**, **Reactor (Cloud and On-Premises)**, **Data Solutions**, and related APIs.
- Act as a technical resource during customer deployment projects, guiding clients through On-Premises and API integration design, onboarding, and go-live.
- Diagnose and resolve API integration issues including authentication errors, rate limiting, timeout failures, and unexpected response behaviors.
- Execute bulk data operations using internal scripts and SQL, including data deletions, alert closures, transaction exports, and user migrations.
- Support data migration efforts, including bulk loading of historical transactions, organization-to-organization migrations, and legacy data imports.
- Advise customers on SSO/Identity integration (OIDC, SAML, Okta) and platform configuration best practices.
- Investigate data discrepancies, transaction processing failures, alert anomalies, and blockchain-specific issues across multiple networks and asset types.
- Develop and maintain working knowledge of blockchain networks, asset types, and protocol-specific behaviors to support customer investigations and troubleshoot chain-specific issues.
- Manage the On-Premises deployment lifecycle, including installation, infrastructure planning, data ingestion, version upgrades, and ongoing maintenance.
- Troubleshoot On-Premises server issues including data ingestion failures, service outages, data update delays, and resource constraints.
- Work within security-sensitive and restricted-access environments, including air-gapped networks and law enforcement or government customer deployments.
- Support customers through On-Premises version upgrades and cloud migration planning, ensuring minimal disruption and data continuity.
- Travel to client sites as needed (approximately 15%) to support On-Premises software installations, upgrades, and critical maintenance.
- Create and maintain technical documentation, runbooks, and SOPs that enable customers and internal teams to resolve common issues independently.
- Collaborate closely with internal Product, Engineering, and Customer Success teams to escalate bugs, communicate customer feedback, and support feature rollouts.
- Mentor and train support agents and peers on technical processes, tooling, and SOPs.

## We're looking for candidates who have:

- 5+ years of experience in a technical support, solutions engineering, systems engineering, or similar customer-facing technical role.
- Demonstrated ability to independently own and drive resolution of complex, multi-stakeholder technical issues from initial triage through to completion.
- Guided customers through API integrations, including reading API documentation, troubleshooting HTTP requests/responses, and advising on integration architecture.
- Written and executed SQL queries for data investigation, extraction, and bulk operations.
- Built scripts or automation (Python, Bash, or similar) to perform bulk data operations and streamline repetitive tasks.
- Communicated complex technical concepts to both technical and non-technical audiences across customers, support teams, and engineering.
- Operated successfully in fast-paced, ticket-driven environments requiring the ability to prioritize and manage multiple concurrent issues, including urgent escalations.
- Supported a global customer base across multiple time zones, adapting communication accordingly.
- Deployed, maintained, or troubleshot server-based or On-Premises software in customer environments with restricted access.
- Comfort working within security-sensitive environments, including law enforcement, government, or classified customer deployments.
- Ideally supported cryptocurrency compliance workflows, such as transaction monitoring, alert triage, or sanctions screening.

## Technical Skills:

- Strong proficiency in SQL and scripting languages (Python, Bash) for data analysis, bulk operations, and automation.
- Experience troubleshooting REST APIs and guiding customers through integration design and deployment.
- Experience working with large-scale transaction data and compliance alerting systems, including alert triage, bulk operations, and data quality investigation.
- Expertise in Linux server administration, including CLI diagnostics and log analysis, with demonstrated ability to perform guided, remote troubleshooting and diagnostics in environments with severely limited visibility and restricted access.
- Solid knowledge of hardware procurement, server deployment, systems performance tuning, and data center operational protocol compliance would be an advantage.
- Familiarity with containerization and orchestration technologies (Docker, Kubernetes) and infrastructure capacity planning.
- Ability to quickly develop deep expertise in Chainalysis products to map them to customer compliance and operational needs.

**About Chainalysis**

Blockchain technology is powering a growing wave of innovation. Businesses and governments around the world are using blockchains to make banking more efficient, connect with their customers, and investigate criminal cases. As adoption of blockchain technology grows, more and more organizations seek access to all this ecosystem has to offer. That’s where Chainalysis comes in. We provide complete knowledge of what’s happening on blockchains through our data, services, and solutions. With Chainalysis, organizations can navigate blockchains safely and with confidence.

**You belong here.**

At Chainalysis, we believe that diversity of experience and thought makes us stronger. With both customers and employees around the world, we are committed to ensuring our team reflects the unique communities around us. We’re ensuring we keep learning by committing to continually revisit and reevaluate our diversity culture. We encourage applicants across any race, ethnicity, gender/gender expression, age, spirituality, ability, experience and more. If you need any accommodations to make our interview process more accessible to you due to a disability, don't hesitate to let us know.

blockchain, crypto, compliance (+7 more)

20 days ago

full time
onsite/hybrid in san-francisco united-states

## **Who We Are**

Hyperbolic Labs is on a mission to democratize AI by breaking down the barriers to computing power with our Open-Access AI Cloud. By aggregating computing resources across the globe, we offer an innovative GPU marketplace and AI inference service that promise affordability and accessibility for all. As pioneers at the intersection of AI and open-source technology, we believe in an open future where AI innovation is limited only by imagination, not by access to resources.

We're looking for forward-thinking individuals who share our passion for making AI universally accessible, secure, and affordable. Join us in building a platform that empowers innovators everywhere to turn their visionary AI projects into reality. As we prepare for growth after our Series A, our team, led by co-founders with PhDs in AI, Math, and Computer Science, is poised to redefine computing.

## **About the Role**

We're seeking a **Senior Infrastructure Engineer** to help build and scale Hyperbolic's GPU Cloud Marketplace by developing a multi-tenant provisioning and virtualization solution. This is a foundational role where you'll be responsible for transforming raw GPUs from diverse global suppliers into a programmable, orchestrated pool that serves thousands of AI developers and researchers. You'll work at the cutting edge of cloud infrastructure, building the core orchestration layer that enables our platform to deliver up to 75% cost savings compared to traditional cloud providers.

## **Who You Are**

- Deep understanding of bare-metal provisioning and lifecycle management, including IPMI/Redfish, BMC-based remote management, PXE boot, and automated OS deployment workflows
- Deep understanding of GPU scheduling and orchestration, including GPU type awareness, memory management, topology considerations, placement strategies for multi-GPU jobs, and fragmentation minimization
- Strong infrastructure and DevOps engineering skills with proficiency in Terraform or Pulumi, CI/CD for infrastructure, secrets management, configuration management, and observability stack implementation
- Experience with storage and data infrastructure for AI/ML workloads, including object storage, high-IOPS block storage, and distributed file systems for training data and checkpoints
- Proficiency with API design and cloud-init for automated provisioning and configuration
- Solid understanding of GPU architecture, CUDA, and GPU compute optimization
- Highly collaborative team player with excellent communication skills across technical and non-technical stakeholders
- Proven ability to work effectively with hardware vendors and vendor engineering teams to troubleshoot issues and optimize integrations
- Experience building and scaling cloud infrastructure or distributed systems in production environments

## **Preferred Qualifications**

- Familiarity with high-performance networking technologies such as InfiniBand and RoCE (RDMA over Converged Ethernet)
- Experience with distributed storage systems such as Ceph, Weka, or VAST Data

*Hyperbolic is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.*

When applying, mention the word **CANDYSHOP** to show you read the job post completely.

machine-learning, cloud, cicd (+3 more)