
Written by Ashnik Team

Jan 13, 2026

4 min read

When SFTP Becomes a Risk: Rebuilding Enterprise File Transfers with a Centralized Messaging Platform

For a long time, file transfers in enterprise systems have been handled using SFTP and a growing collection of scripts around it. In our environment, this approach had become increasingly fragile. Password expiry issues, dependency on manual scripts, and operational overhead were showing up repeatedly, especially during time-bound and critical workflows.

As the project lead, I was responsible for addressing this problem end-to-end. Along with my team at Ashnik, we took a step back and questioned whether continuing with SFTP-based transfers was sustainable. The answer was straightforward. We needed a centralized, controlled, and reliable way to move files and data without increasing operational risk.

This article captures how we approached that problem and what we built as a team.

Defining the Objective Clearly

Before making any technical decisions, we aligned internally on a clear objective.

The goal was not simply to replace SFTP with another mechanism. It was to remove the operational fragility caused by scattered scripts and credentials, while introducing a centralized messaging platform that could reliably transfer files and data during controlled operational windows.

We focused on building something predictable, auditable, and resilient. This clarity helped the Ashnik team make consistent decisions throughout the design and implementation phases.

Setting a Disciplined Scope

One of the most important aspects of this project was keeping the scope sharply defined.

Our scope included migrating existing SFTP-based file transfers to a centralized messaging platform using Kafka, building Java-based custom plugins for source and target systems, and implementing contingency scripts for failure scenarios.

Cluster setup, monitoring, and regular patching were treated as part of the platform itself, not as post-delivery tasks. This approach helped us maintain operational readiness from day one and avoided introducing hidden dependencies later.

Technology Choices and Their Rationale

We designed the platform around Java-based plugins that run on JDK 8 and above, independent of the underlying operating system. This allowed the solution to integrate cleanly with a wide range of application environments.

At the core, we used Confluent Kafka running in KRaft mode to establish a centralized messaging platform. This allowed us to operate without ZooKeeper and align with a modern Kafka control plane.

Security was built into the design from the beginning. All communication is encrypted using TLS, and access to topics is controlled through role-based access control. Artifacts are packaged as JARs and distributed in a controlled manner, and infrastructure setup is automated using Ansible to ensure consistency.
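
As a rough illustration of what that client-side security setup can look like, the sketch below builds a TLS-enabled producer with the standard Kafka Java client. The broker address, keystore paths, and passwords are placeholders rather than the platform's actual values.

```java
import java.util.Properties;

import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.config.SslConfigs;
import org.apache.kafka.common.serialization.ByteArraySerializer;
import org.apache.kafka.common.serialization.StringSerializer;

// Illustrative TLS client configuration for a source-side plugin.
// Hostnames, ports, keystore paths, and passwords are placeholders.
public class SecureProducerFactory {

    public static KafkaProducer<String, byte[]> create() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker-1.example.internal:9093");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());

        // Encrypt all traffic between the plugin and the brokers.
        props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SSL");
        props.put(SslConfigs.SSL_TRUSTSTORE_LOCATION_CONFIG, "/etc/kafka/secrets/client.truststore.jks");
        props.put(SslConfigs.SSL_TRUSTSTORE_PASSWORD_CONFIG, "changeit");
        props.put(SslConfigs.SSL_KEYSTORE_LOCATION_CONFIG, "/etc/kafka/secrets/client.keystore.jks");
        props.put(SslConfigs.SSL_KEYSTORE_PASSWORD_CONFIG, "changeit");

        // Favor durability: wait for all in-sync replicas and keep the producer idempotent.
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");

        return new KafkaProducer<>(props);
    }
}
```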

Monitoring is handled through the ELK stack (Elasticsearch, Logstash, and Kibana), providing visibility into platform behavior. For data center (DC) and disaster recovery (DR) scenarios, orchestration workflows are used to manage controlled switchovers.

These choices reflect the Ashnik team’s experience in building platforms that are designed not just to run, but to be operated reliably.

How the Platform Works End to End

[Figure: End-to-end file transfer flow through the platform]

On the source side, the plugin continuously monitors configured directories. When a trigger file is detected, the plugin checks for the presence of the associated data file. If the associated file is missing, the event is moved to an error directory.

Once the associated file is available, a checksum is calculated and attached to the message. The file is then sent to Kafka in chunks.
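
The plugin code itself stays internal to the platform, but the sketch below shows the general shape of this source-side step: compute a checksum, then publish the file in fixed-size chunks with the checksum carried in message headers. The topic name, chunk size, and header convention are hypothetical, and directory watching and trigger-file handling are left out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Hypothetical source-side helper: checksum the data file, then publish it in fixed-size chunks.
public class ChunkedFileSender {

    private static final String DATA_TOPIC = "file-transfer.data"; // illustrative topic name
    private static final int CHUNK_SIZE = 512 * 1024;              // 512 KB per Kafka message

    public static void send(Producer<String, byte[]> producer, Path dataFile)
            throws IOException, NoSuchAlgorithmException {
        String checksum = sha256(dataFile);
        String fileKey = dataFile.getFileName().toString();

        byte[] buffer = new byte[CHUNK_SIZE];
        int chunkIndex = 0;
        try (InputStream in = Files.newInputStream(dataFile)) {
            int read;
            while ((read = in.read(buffer)) > 0) {
                ProducerRecord<String, byte[]> chunk =
                        new ProducerRecord<>(DATA_TOPIC, fileKey, Arrays.copyOf(buffer, read));
                // Carry the checksum and chunk position as headers so the sink can reassemble and verify.
                chunk.headers().add("checksum", checksum.getBytes(StandardCharsets.UTF_8));
                chunk.headers().add("chunk-index",
                        Integer.toString(chunkIndex++).getBytes(StandardCharsets.UTF_8));
                producer.send(chunk);
            }
        }

        // Final marker tells the sink how many chunks to expect for this file.
        ProducerRecord<String, byte[]> done = new ProducerRecord<>(DATA_TOPIC, fileKey, new byte[0]);
        done.headers().add("checksum", checksum.getBytes(StandardCharsets.UTF_8));
        done.headers().add("chunk-count", Integer.toString(chunkIndex).getBytes(StandardCharsets.UTF_8));
        producer.send(done);
        producer.flush();
    }

    private static String sha256(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = new DigestInputStream(Files.newInputStream(file), digest)) {
            byte[] skip = new byte[8192];
            while (in.read(skip) > 0) { /* reading the stream updates the digest */ }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

A producer configured as in the earlier security sketch can be passed straight into this method.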

On the target side, the sink plugin consumes the file from Kafka and verifies that all chunks have been received. If chunks are missing, the condition is logged as an error. When all chunks are present, the checksum is recalculated and validated before the file is written to the target directory.
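
A matching sink-side sketch is shown below. It buffers chunks per file key in memory and uses the checksum as the completeness check; the production plugin's chunk accounting and storage strategy are not reproduced here, so treat this purely as an illustration of the verify-then-write step.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.header.Header;

// Hypothetical sink-side reassembly: buffer chunks per file, verify the checksum, then write the file.
public class ChunkedFileReceiver {

    private final Map<String, ByteArrayOutputStream> buffers = new HashMap<>();
    private final Path targetDir;

    public ChunkedFileReceiver(Path targetDir) {
        this.targetDir = targetDir;
    }

    public void onRecord(ConsumerRecord<String, byte[]> record)
            throws IOException, NoSuchAlgorithmException {
        String fileKey = record.key();
        Header endMarker = record.headers().lastHeader("chunk-count");

        if (endMarker == null) {
            // Ordinary data chunk: append it to the buffer for this file.
            buffers.computeIfAbsent(fileKey, k -> new ByteArrayOutputStream()).write(record.value());
            return;
        }

        // End-of-file marker: validate the checksum before writing to the target directory.
        ByteArrayOutputStream buffer = buffers.remove(fileKey);
        Header checksumHeader = record.headers().lastHeader("checksum");
        if (buffer == null || checksumHeader == null) {
            throw new IOException("Incomplete transfer for " + fileKey); // logged as an error in practice
        }
        byte[] content = buffer.toByteArray();
        String expected = new String(checksumHeader.value(), StandardCharsets.UTF_8);
        if (!sha256(content).equals(expected)) {
            throw new IOException("Checksum mismatch for " + fileKey);
        }
        Files.write(targetDir.resolve(fileKey), content);
    }

    private static String sha256(byte[] content) throws NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest(content)) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

Because records are keyed by file name, all chunks of a file land on the same partition and arrive in order, which is what makes simple append-based reassembly like this possible.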

This flow ensures that file integrity is preserved while allowing the system to handle transient failures in a controlled manner.

Architecture and Application Integration Model

[Figure: Architecture and application integration model]

We deliberately chose an application-embedded plugin model.

Source applications run a Kafka source plugin that connects to the centralized Kafka cluster and produces data to topics. Sink applications run a corresponding Kafka sink plugin that consumes data from those topics.

The centralized messaging platform includes Kafka brokers, controllers, connectors, and schema registry components. This architecture allows multiple applications, servers, and directories to participate without tightly coupling them to one another.

This design gives us a scalable and consistent integration model across the environment.

Security and Access Control Design

[Figure: Security and access control design]

Client authentication is handled using JAAS configuration. All communication between clients and brokers is encrypted using TLS.

Authorization is enforced using ACLs and role-based access control, ensuring that applications can only access the topics they are permitted to. Communication between brokers and controllers is also secured.
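
The specific login modules and role mappings used in the platform are not detailed in this article. As one illustration, a plugin client authenticating over SASL_SSL with a SCRAM login module could be configured along the following lines, with principal names and secrets as placeholders.

```java
import java.util.Properties;

import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.common.config.SaslConfigs;
import org.apache.kafka.common.config.SslConfigs;

// Illustrative authentication settings for a plugin client; principals and secrets are placeholders.
public class ClientAuthConfig {

    public static Properties authProperties() {
        Properties props = new Properties();

        // TLS for encryption in transit, SASL for client authentication.
        props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SASL_SSL");
        props.put(SslConfigs.SSL_TRUSTSTORE_LOCATION_CONFIG, "/etc/kafka/secrets/client.truststore.jks");
        props.put(SslConfigs.SSL_TRUSTSTORE_PASSWORD_CONFIG, "changeit");

        // JAAS login module supplied inline; brokers map this principal to its permitted topics via ACLs.
        props.put(SaslConfigs.SASL_MECHANISM, "SCRAM-SHA-512");
        props.put(SaslConfigs.SASL_JAAS_CONFIG,
                "org.apache.kafka.common.security.scram.ScramLoginModule required "
                        + "username=\"app-source-01\" password=\"<secret-from-vault>\";");
        return props;
    }
}
```

On the broker side, ACLs then grant that principal only the operations it needs, for example read on the topics its sink consumes or write on the topics its source produces to.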

Security was treated as a foundational requirement, not something to be added later, and the platform reflects that discipline.

Capabilities Enabled in the Live Platform

In production, the platform supports transferring files from multiple source directories on the same server, as well as across multiple servers. It also supports scenarios where multiple files are associated with a single trigger file.

State is maintained so that legacy consumers using the older SFTP method can continue to function alongside streaming-based consumers. Trigger files are sent along with the original data, preserving existing workflow semantics.

The platform includes configurable retry mechanisms that attempt delivery when services are restored. Contingency scripts are available for controlled manual execution when required.
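
The retry behavior is driven by configuration rather than fixed values. The sketch below shows one simple bounded-retry pattern with exponential backoff; the attempt count and delay limits are illustrative, not the platform's actual settings.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;

// Illustrative bounded retry with exponential backoff for transient delivery failures.
public final class RetryingDelivery {

    public static <T> T withRetries(Callable<T> delivery, int maxAttempts, long initialBackoffMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long backoff = initialBackoffMs;
        Exception lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return delivery.call();
            } catch (IOException e) {
                // Transient failure (for example, the target is temporarily unavailable): wait and retry.
                lastFailure = e;
                TimeUnit.MILLISECONDS.sleep(backoff);
                backoff = Math.min(backoff * 2, TimeUnit.MINUTES.toMillis(5));
            }
        }
        // All attempts exhausted: surface the failure so operators can fall back to the contingency path.
        throw lastFailure;
    }
}
```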

Integrity checks, audit logs, and alerts for incorrect YAML configuration are built into the system to support operational visibility and governance.
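
As an example of the kind of configuration check involved, a plugin could fail fast on malformed or incomplete YAML along the following lines. The SnakeYAML parser and the required keys shown here are assumptions for illustration, not the platform's actual configuration schema.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

import org.yaml.snakeyaml.Yaml;
import org.yaml.snakeyaml.error.YAMLException;

// Illustrative YAML configuration check; the required keys listed here are hypothetical.
public class ConfigValidator {

    private static final List<String> REQUIRED_KEYS =
            Arrays.asList("source.directory", "trigger.suffix", "kafka.topic");

    public static Map<?, ?> loadAndValidate(Path configFile) throws IOException {
        try (InputStream in = Files.newInputStream(configFile)) {
            Object loaded = new Yaml().load(in);
            if (!(loaded instanceof Map)) {
                throw new IllegalStateException("Configuration is not a YAML mapping: " + configFile);
            }
            Map<?, ?> config = (Map<?, ?>) loaded;
            for (String key : REQUIRED_KEYS) {
                if (!config.containsKey(key)) {
                    // In the live platform this condition raises an alert; here we simply fail fast.
                    throw new IllegalStateException("Missing required key '" + key + "' in " + configFile);
                }
            }
            return config;
        } catch (YAMLException e) {
            throw new IllegalStateException("Invalid YAML in " + configFile, e);
        }
    }
}
```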

What Changed After Implementation

Operationally, this platform reduced our dependence on scattered scripts and manual interventions. File and data transfers are now handled through a single centralized messaging platform.

More importantly, it increased confidence in file integrity and delivery during critical workflows. The behavior of the system is predictable, observable, and easier to manage.

Closing Reflection

This initiative was not just about replacing a protocol. It was about introducing discipline into how files and data move across systems.

Design choices like trigger files, checksum validation, and controlled retries may seem small in isolation, but together they make the platform trustworthy in production.

Leading this effort with the Ashnik team reinforced a simple lesson for me. Reliable platforms are built through careful constraints, not complexity. When those constraints are respected, the system earns the right to be trusted.

