Building a Payment Gateway: Technical Deep Dive
How we built a secure, scalable payment processing system handling thousands of transactions daily. Architecture, security, and lessons learned.

Payment systems are critical infrastructure. One bug can cost millions. Here's how we built hsrcpay.com to handle transactions securely at scale.
The Challenge
Build a payment gateway that:
- Processes 10K+ transactions/day
- Handles multiple payment providers
- Maintains 99.99% uptime
- Complies with PCI-DSS standards
- Provides instant reconciliation
Architecture Overview
High-Level Design
Client → API Gateway → Payment Service → Provider (Stripe/PayPal)
                              ↓
                    Database (PostgreSQL)
                              ↓
                    Message Queue (RabbitMQ)
                              ↓
                    Background Workers (Reconciliation)
Key Technical Decisions
1. Idempotency Keys
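Problem: Network failures lead to retries, and a retried request can charge the customer twice.
Solution: Every payment request carries a unique idempotency key. A retry with the same key returns the stored result instead of charging again.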
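interface PaymentRequest {
  idempotencyKey: string; // UUID
  amount: number;
  currency: string;
  userId: string;
  // ...
}
async function processPayment(req: PaymentRequest) {
  // Fast path: this key was already handled, so this is a retry.
  const existing = await db.transactions.findByIdempotencyKey(
    req.idempotencyKey
  );
  if (existing) {
    return existing; // Return the stored result instead of charging again
  }
  // Process new payment.
  // Caveat: two concurrent requests with the same key can both pass the
  // lookup above and both reach charge(). Closing that window means
  // reserving the key (for example, inserting a PENDING row under a UNIQUE
  // constraint) before calling the provider.
  const result = await paymentProvider.charge(req);
  await db.transactions.save({
    ...result,
    idempotencyKey: req.idempotencyKey
  });
  return result;
}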
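One way to handle concurrent retries is to reserve the key before touching the provider. The sketch below is a minimal illustration, not our production code: `TransactionStore`, `processPaymentOnce`, and the `charge` callback are hypothetical names, and the in-memory map stands in for a table with a UNIQUE constraint on the idempotency key.

```typescript
interface TxRow {
  status: "PENDING" | "COMPLETED";
  chargeId?: string;
}

// In-memory stand-in for a transactions table with a UNIQUE idempotency key.
class TransactionStore {
  private rows = new Map<string, TxRow>();

  // Mimics INSERT ... ON CONFLICT DO NOTHING: true only for the first caller.
  reserve(key: string): boolean {
    if (this.rows.has(key)) return false;
    this.rows.set(key, { status: "PENDING" });
    return true;
  }

  complete(key: string, chargeId: string): TxRow {
    const row: TxRow = { status: "COMPLETED", chargeId };
    this.rows.set(key, row);
    return row;
  }

  get(key: string): TxRow | undefined {
    return this.rows.get(key);
  }
}

async function processPaymentOnce(
  store: TransactionStore,
  idempotencyKey: string,
  charge: () => Promise<string> // hypothetical provider call returning a charge ID
): Promise<TxRow> {
  if (!store.reserve(idempotencyKey)) {
    // Another request owns this key: report its row instead of charging again.
    return store.get(idempotencyKey) as TxRow;
  }
  const chargeId = await charge();
  return store.complete(idempotencyKey, chargeId);
}
```

Because the reservation happens before the first `await`, a second request with the same key sees the PENDING row and never reaches the provider. A caller that gets a PENDING row back would typically poll or ask the client to retry later.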
2. State Machine for Transaction Status
We use a finite state machine to track transaction states:
PENDING → PROCESSING → COMPLETED
↓
FAILED → REFUNDING → REFUNDED
This prevents invalid state transitions and makes the system predictable.
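A transition table is one simple way to enforce this. The sketch below encodes one reading of the diagram above (it assumes the downward arrow leaves PROCESSING); `TxStatus`, `ALLOWED_TRANSITIONS`, and `transition` are illustrative names rather than our actual code.

```typescript
type TxStatus =
  | "PENDING"
  | "PROCESSING"
  | "COMPLETED"
  | "FAILED"
  | "REFUNDING"
  | "REFUNDED";

// Assumed reading of the diagram: each state lists the states it may move to.
const ALLOWED_TRANSITIONS: Record<TxStatus, TxStatus[]> = {
  PENDING: ["PROCESSING"],
  PROCESSING: ["COMPLETED", "FAILED"],
  COMPLETED: [],
  FAILED: ["REFUNDING"],
  REFUNDING: ["REFUNDED"],
  REFUNDED: [],
};

// Returns the new status, or throws if the move is not in the table.
function transition(from: TxStatus, to: TxStatus): TxStatus {
  if (ALLOWED_TRANSITIONS[from].indexOf(to) === -1) {
    throw new Error(`Invalid transition: ${from} -> ${to}`);
  }
  return to;
}
```

Routing every status change through one function like this means an illegal move (say, COMPLETED back to PENDING) fails loudly instead of silently corrupting a record.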
3. Webhook Handling
Payment providers send webhooks for async events:
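// Webhook handler
async function handleWebhook(event: WebhookEvent) {
  // 1. Verify signature
  const isValid = verifySignature(event);
  if (!isValid) throw new Error('Invalid signature');
  // 2. Atomically claim the event ID. A separate get-then-set would let two
  //    concurrent deliveries both pass the check. SET ... NX returns null
  //    when the key already exists (argument order assumes an ioredis-style client).
  const key = `webhook:${event.id}`;
  const claimed = await redis.set(key, 'true', 'EX', 86400, 'NX');
  if (claimed === null) return { status: 'already_processed' };
  // 3. Process event; release the claim on failure so the provider's retry runs
  try {
    await processEvent(event);
  } catch (err) {
    await redis.del(key);
    throw err;
  }
  return { status: 'success' };
}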
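The signature check itself usually comes down to an HMAC over the raw request body. Here is a minimal sketch using Node's built-in crypto module. The scheme (hex-encoded HMAC-SHA256 of the body) and the names `signPayload` and `verifyWebhookSignature` are assumptions for illustration; each provider documents its own header format, and some also sign a timestamp.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumed scheme: signature = hex(HMAC-SHA256(secret, rawBody)).
function signPayload(rawBody: string, secret: string): string {
  return createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
}

function verifyWebhookSignature(
  rawBody: string,
  signatureHex: string,
  secret: string
): boolean {
  const expected = Buffer.from(signPayload(rawBody, secret), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on unequal lengths, so reject those up front.
  if (expected.length !== received.length) return false;
  // Constant-time comparison avoids leaking how many bytes matched.
  return timingSafeEqual(expected, received);
}
```

Verify against the raw bytes received, not a re-serialized JSON object, since re-serializing can change whitespace or key order and break the signature.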
4. Security Measures
Data at Rest:
- Encrypted database
- PCI-compliant tokenization
- Never store full card numbers
Data in Transit:
- TLS 1.3 only
- Certificate pinning for mobile apps
Access Control:
- JWT with short expiry
- Rate limiting (100 req/min per user)
- IP whitelisting for webhooks
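To make the rate limit concrete, here is a rough sketch of a fixed-window limiter at 100 requests per minute per user. It is an in-memory illustration only: `FixedWindowRateLimiter` is a hypothetical name, and a multi-instance deployment would need shared state (for example in Redis) instead of a local map.

```typescript
class FixedWindowRateLimiter {
  private windows = new Map<string, { start: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number = 100, windowMs: number = 60000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed; `now` is injectable for testing.
  allow(userId: string, now: number = Date.now()): boolean {
    const w = this.windows.get(userId);
    if (!w || now - w.start >= this.windowMs) {
      // First request, or the previous window has expired: start a new one.
      this.windows.set(userId, { start: now, count: 1 });
      return true;
    }
    if (w.count >= this.limit) return false;
    w.count += 1;
    return true;
  }
}
```

A fixed window is the simplest option but allows short bursts around window boundaries; a sliding window or token bucket smooths that out at the cost of a little more bookkeeping.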
5. Reconciliation System
Daily automated reconciliation:
async function reconcileTransactions(date: Date) {
  // 1. Fetch from our DB
  const ourTransactions = await db.transactions.forDate(date);
  // 2. Fetch from provider
  const providerTransactions = await provider.listTransactions(date);
  // 3. Compare and flag discrepancies
  const mismatches = findMismatches(ourTransactions, providerTransactions);
  if (mismatches.length > 0) {
    await notifyTeam(mismatches);
    await createReconciliationTickets(mismatches);
  }
}
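The comparison step can be sketched as a keyed diff. This is one possible shape for `findMismatches`, not necessarily ours: the `LedgerEntry` and `Mismatch` types are assumptions, and amounts are assumed to be integers in minor units (cents) so equality checks are exact.

```typescript
interface LedgerEntry {
  id: string;
  amountMinor: number; // integer minor units, e.g. cents
  status: string;
}

type Mismatch =
  | { kind: "missing_at_provider"; id: string }
  | { kind: "missing_locally"; id: string }
  | { kind: "field_mismatch"; id: string; fields: string[] };

function findMismatches(ours: LedgerEntry[], theirs: LedgerEntry[]): Mismatch[] {
  const theirsById = new Map<string, LedgerEntry>();
  for (const t of theirs) theirsById.set(t.id, t);

  const mismatches: Mismatch[] = [];
  for (const o of ours) {
    const t = theirsById.get(o.id);
    if (!t) {
      mismatches.push({ kind: "missing_at_provider", id: o.id });
      continue;
    }
    const fields: string[] = [];
    if (o.amountMinor !== t.amountMinor) fields.push("amountMinor");
    if (o.status !== t.status) fields.push("status");
    if (fields.length > 0) {
      mismatches.push({ kind: "field_mismatch", id: o.id, fields });
    }
    theirsById.delete(o.id); // whatever remains exists only at the provider
  }
  theirsById.forEach((_entry, id) => {
    mismatches.push({ kind: "missing_locally", id });
  });
  return mismatches;
}
```

Distinguishing the three kinds matters in practice: a transaction missing locally usually points to a lost webhook, while one missing at the provider suggests a charge that never actually went through.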
Performance Optimizations
1. Database Indexing
Critical indexes for payment queries:
CREATE INDEX idx_transactions_user_created
  ON transactions(user_id, created_at DESC);
CREATE INDEX idx_transactions_status_created
  ON transactions(status, created_at DESC);
2. Caching Strategy
- User limits: Redis (5 min TTL)
- Provider status: Redis (30 sec TTL)
- Transaction results: Redis (24 hour TTL)
3. Queue-Based Processing
Non-critical operations run async:
- Email receipts
- Webhook retries
- Analytics updates
- Reporting
Monitoring & Alerting
We track:
- Success rate: Must be >99.5%
- Response time: P95 <500ms
- Failed payments: Alert if >10 in 5 min
- Provider downtime: Auto-failover to backup
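The failed-payments rule ("more than 10 in 5 minutes") is a sliding-window count. A minimal sketch follows, assuming the check runs in-process. `FailureAlert` is a hypothetical name, and in practice a rule like this often lives in the monitoring system rather than in application code.

```typescript
class FailureAlert {
  private timestamps: number[] = [];
  private readonly threshold: number;
  private readonly windowMs: number;

  constructor(threshold: number = 10, windowMs: number = 5 * 60 * 1000) {
    this.threshold = threshold;
    this.windowMs = windowMs;
  }

  // Records one failure and returns true when the number of failures
  // inside the window exceeds the threshold. `now` is injectable for testing.
  recordFailure(now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Drop failures that have aged out of the window.
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    this.timestamps.push(now);
    return this.timestamps.length > this.threshold;
  }
}
```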
Lessons Learned
1. Always Have a Backup Provider
When Stripe went down for 2 hours, our fallback to PayPal saved the day.
2. Test Failure Scenarios
We simulate:
- Provider timeouts
- Database connection loss
- Network partitions
- Race conditions
3. Observability Is Key
Without proper logging, debugging payment issues is impossible. We log:
- Every request/response
- All state transitions
- Provider API calls
- Error stack traces
4. Never Trust External APIs
Wrap provider SDKs with circuit breakers and retries:
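// Assumption: CircuitBreaker is the opossum package, whose options use these names.
import CircuitBreaker from 'opossum';

// Wrap the call in an arrow function so provider.charge keeps its `this` binding.
const circuitBreaker = new CircuitBreaker((req: PaymentRequest) => provider.charge(req), {
  timeout: 5000,                // treat a call as failed after 5 s
  errorThresholdPercentage: 50, // open the circuit once 50% of calls fail
  resetTimeout: 30000           // allow a trial call again after 30 s
});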
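To show what `errorThresholdPercentage` and `resetTimeout` mean in the snippet above, here is a deliberately simplified breaker. It is a teaching sketch, not how any particular library is implemented: `SimpleBreaker` and `minCalls` are hypothetical, and it counts calls since the last reset rather than over a rolling time window.

```typescript
type BreakerState = "CLOSED" | "OPEN" | "HALF_OPEN";

class SimpleBreaker {
  private state: BreakerState = "CLOSED";
  private successes = 0;
  private failures = 0;
  private openedAt = 0;
  private readonly errorThresholdPercentage: number;
  private readonly resetTimeoutMs: number;
  private readonly minCalls: number;

  constructor(errorThresholdPercentage = 50, resetTimeoutMs = 30000, minCalls = 4) {
    this.errorThresholdPercentage = errorThresholdPercentage;
    this.resetTimeoutMs = resetTimeoutMs;
    this.minCalls = minCalls;
  }

  // Should the next call be attempted at time `now` (ms)?
  canCall(now: number): boolean {
    if (this.state === "OPEN" && now - this.openedAt >= this.resetTimeoutMs) {
      this.state = "HALF_OPEN"; // reset timeout elapsed: allow trial calls
    }
    return this.state !== "OPEN";
  }

  // Report the outcome of a call that was allowed through.
  record(ok: boolean, now: number): void {
    if (this.state === "HALF_OPEN") {
      // The trial call decides: close on success, reopen on failure.
      if (ok) this.reset();
      else this.trip(now);
      return;
    }
    if (ok) this.successes += 1;
    else this.failures += 1;
    const total = this.successes + this.failures;
    const errorPct = (this.failures / total) * 100;
    if (total >= this.minCalls && errorPct >= this.errorThresholdPercentage) {
      this.trip(now);
    }
  }

  getState(): BreakerState {
    return this.state;
  }

  private trip(now: number): void {
    this.state = "OPEN";
    this.openedAt = now;
  }

  private reset(): void {
    this.state = "CLOSED";
    this.successes = 0;
    this.failures = 0;
  }
}
```

While the breaker is open, calls fail fast instead of piling up behind a struggling provider, which is also the natural point to route traffic to a backup. Retries layered on top are only safe for charges when the request carries an idempotency key.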
Results
After 1 year in production:
- 99.98% uptime
- <300ms P95 latency
- Zero data breaches
- 10K+ daily transactions
- $2M+ processed monthly
Conclusion
Building payment systems is hard but rewarding. Key takeaways:
- Idempotency is non-negotiable
- Security must be paranoid
- Monitor everything
- Plan for provider failures
- Automate reconciliation
Questions about payment systems? Hit me up!
Next post: How we reduced payment processing costs by 40% with smart routing.



