Comment by jiggawatts

17 hours ago

You and the parent are both "missing the point", and sadly the manufacturer (IBM) doesn't talk about it either.

I used to work for Citrix, which is "software that turns Windows into a mainframe OS". Basically, you get remote thin terminals the same as you would with an IBM mainframe, but instead of showing you green text you get a Windows desktop.

Citrix used to sell this as a "cost saving" solution that inevitably ended up costing 2-3x as much as traditional desktops.

The real benefit for both IBM mainframes and Citrix is: latency.

You can't avoid the speed of light, but centralising data and compute into "one box" or as close as you can get it (one rack, one data centre, etc...) provides enormous benefits to most kinds of applications.

If you have some complex business workflow that needs to talk to dozens of tables in multiple logical databases, then having all of that unfold in a single mainframe will be faster than if it has to bounce around a network in a "modern" architecture.

In real enterprise environments (i.e.: not a FAANG) any traffic that has to traverse between servers will typically use 10 Gbps NICs at best (not 100 Gbps!), have no topology optimisation of any kind, and flow through at a minimum one load balancer, one firewall, one router, and multiple switches.

Within a mainframe you might have low double-digit microsecond latencies between processes or LPARs; across an enterprise network, between services on independent servers, it's not unusual to see well over one millisecond -- one hundred times slower.
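
A back-of-envelope sketch of why that matters (the call count and per-hop latencies below are illustrative assumptions, not measurements of any particular system):

```python
# Rough model: one business transaction that makes N sequential, dependent
# round trips (queries, lookups, service calls) before it can respond.
# Illustrative per-hop latencies: ~15 us inside one mainframe/LPAR,
# ~1 ms across a typical enterprise network.
N_CALLS = 200  # dependent round trips per transaction (illustrative)

for label, per_hop_s in [("inside the mainframe (~15 us/hop)", 15e-6),
                         ("across the enterprise network (~1 ms/hop)", 1e-3)]:
    total_ms = N_CALLS * per_hop_s * 1000
    print(f"{label}: {total_ms:.0f} ms per transaction")

# inside the mainframe (~15 us/hop): 3 ms per transaction
# across the enterprise network (~1 ms/hop): 200 ms per transaction
```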

This is why mainframes are still king for many orgs: They're the ultimate solution for dealing with speed-of-light delays.

PS: I've seen multiple attempts to convert mainframe solutions to modern "racks of boxes" and it was hilarious to watch the architects be totally mystified as to why everything was running like slow treacle when on paper the total compute throughput was an order of magnitude higher than the original mainframe had. They neglected latency in their performance modelling, that's why!

The mainframe itself (or any other platform for that matter) is not magical with regards to latency. It's all about proper architecture for the workload. Mainframes do provide a nice environment for being able to push huge volumes of IO though.

  • > The mainframe itself (or any other platform for that matter) is not magical with regards to latency.

    Traveling at c, a signal covers about 300 mm (30 cm; 12") in one nanosecond. And data signals do not travel over fibre or copper at c, but slower -- roughly two-thirds of it. Plus add network device processing latency. Now double all of that to get the response back to you.

    When everything is within the distance of one rack, you save a whole lot of nanoseconds just by not having to go as far.
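
    Just the propagation side of that, as a quick sketch (the ~0.67 velocity factor is an approximation for signals in fibre or copper):

    ```python
    # Speed-of-light budget only: no switches, no serialization, no software.
    C = 299_792_458.0        # m/s, speed of light in vacuum
    VELOCITY_FACTOR = 0.67   # rough factor for signals in fibre/copper

    def round_trip_us(distance_m: float) -> float:
        """One-way distance in metres -> round-trip propagation delay in us."""
        return 2 * distance_m / (C * VELOCITY_FACTOR) * 1e6

    for label, metres in [("within one rack", 2),
                          ("across a data centre", 300),
                          ("between data centres", 50_000)]:
        print(f"{label}: ~{round_trip_us(metres):.2f} us round trip")

    # within one rack: ~0.02 us round trip
    # across a data centre: ~2.99 us round trip
    # between data centres: ~497.86 us round trip
    ```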

    • More to the point, transmitting a 1500 byte packet at a given network data rate takes time: at 10 Gbps that's about 1.2 microseconds per direction, so roughly 2.5 microseconds for the round trip even over a hypothetical "zero length" cable (back-of-envelope sketch at the end of this comment).

      Then add in the switching, routing, firewall, and load balancer overheads. Don't forget the buffering, kernel-to-user-mode transitions, "work" such as packet inspection, etc...

      The net result is at least 50 microseconds in the best networks I've ever seen, such as what AWS has between modern VM SKUs in the same VPC in the same zone. Typical numbers are more like 150-300 microseconds within a data centre.[1]

      If anything ping-pongs between data centres, then add +1 millisecond per hop.

      Don't forget the occasional 3-way TCP handshake plus the TLS handshake plus the HTTP overheads!

      I've seen PaaS services talking to each other with ~15 millisecond (not micro!) latencies.

      [1] It's possible to get down to single digit microseconds with Infiniband, but only with software written specifically for this using a specialised SDK.
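
      Here's the back-of-envelope version of the above (packet sizes, handshake round-trip counts, and the in-DC RTT are illustrative assumptions):

      ```python
      # Serialization delay: time just to clock a 1500-byte frame onto the wire.
      FRAME_BITS = 1500 * 8
      for gbps in (10, 100):
          one_way_us = FRAME_BITS / (gbps * 1e9) * 1e6
          print(f"{gbps:>3} Gbps: {one_way_us:.2f} us one way, "
                f"{2 * one_way_us:.2f} us round trip")

      # Round trips burned before the first byte of a cold HTTPS response:
      RTT_US = 300             # "typical" in-DC round trip from above, upper end
      ROUND_TRIPS = 1 + 2 + 1  # TCP handshake + full TLS 1.2 handshake + HTTP request
      print(f"cold HTTPS call: ~{ROUND_TRIPS * RTT_US} us of waiting before any real work")

      #  10 Gbps: 1.20 us one way, 2.40 us round trip
      # 100 Gbps: 0.12 us one way, 0.24 us round trip
      # cold HTTPS call: ~1200 us of waiting before any real work
      ```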

  • Again, missing the point. Just look at the numbers.

    Mainframe manufacturers talk about "huge IO throughputs", but a rack of x86 kit with ordinary SSD SAN storage will have extra zeroes on the aggregate throughput. Similarly, on a bandwidth/dollar basis, Intel-compatible generic server boxes are vastly cheaper than any mainframe. Unless you're buying the very largest mainframes ($billions!), a single Intel box will practically always win for the same budget. E.g.: just pack it full of NVMe SSDs and enjoy ~100GB/s cached read throughput on top of ~20GB/s writes to remote "persistent" storage (rough tally at the end of this comment).

    The "architecture" here is all about the latency. Sure, you can "scale" a data centre full of thousands of boxes far past the maximums of any single mainframe, but then the latency necessarily goes up because of physics, not to mention the practicalities of large-scale Ethernet networking.

    The closest you can get to the properties of a mainframe is to put everything into one rack and use RDMA with Infiniband.
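
    Rough tally for the NVMe claim above (drive count and per-drive figures are illustrative assumptions about a commodity box, not a specific model):

    ```python
    # Aggregate sequential throughput of one NVMe-packed server, as a naive sum.
    DRIVES = 16                  # bays in a typical 2U chassis (illustrative)
    READ_GBPS_PER_DRIVE = 7.0    # GB/s, PCIe 4.0-class NVMe sequential read
    WRITE_GBPS_PER_DRIVE = 4.0   # GB/s, sequential write

    print(f"raw read:  ~{DRIVES * READ_GBPS_PER_DRIVE:.0f} GB/s")
    print(f"raw write: ~{DRIVES * WRITE_GBPS_PER_DRIVE:.0f} GB/s")
    # raw read:  ~112 GB/s
    # raw write: ~64 GB/s
    # PCIe lanes, memory bandwidth and the filesystem cap this below the naive
    # sum, and durable writes to *remote* storage are limited by the network.
    ```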

    • You have to think of the mainframe as a platform like AWS or Kubernetes or VMWare. Saying “AWS has huge throughput” is meaningless.

      The features of the platform are the real technical edge. You need to use those features to get the benefits.

      I’ve moved big mainframe apps to Unix or Windows systems. There’s no magic… you just need to refactor around the constraints of the target system, which are different from the mainframe’s.

    • > The closest you can get to the properties of a mainframe is to put everything into one rack and use RDMA with Infiniband.

      Or PCIe... I really would like to try building that.

> The real benefit for ... Citrix is: latency.

I understand your point about big iron, but where does something like Citrix reduce latency?

My estimation would be: compared to a desktop, Citrix adds distance and layers between the user and the compute/storage/etc., and contention for resources on the Citrix server would tend to increase latency compared to a mostly idle desktop.

  • Think WAN latency. Your shitty VB6 app using some awful circa 2002 middleware dies when you have 80ms of latency between you and the server.

    Insert Citrix, and the turd app is 3ms away in the data center and works. (Rough numbers at the end of this comment.)

    I used to run a 100k user VDI environment. The cost was easily 4x from a hardware POV, but I had 6 guys running it, and it was always consistent.
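
    Rough numbers for why that works (the round-trip count per user action is an illustrative assumption; the 80 ms and 3 ms figures are from above):

    ```python
    # A chatty client/server app making many sequential round trips per action.
    ROUND_TRIPS_PER_ACTION = 100   # illustrative for 2002-era middleware

    for label, rtt_ms in [("app at the branch, server 80 ms away", 80),
                          ("app on Citrix, server 3 ms away", 3)]:
        seconds = ROUND_TRIPS_PER_ACTION * rtt_ms / 1000
        print(f"{label}: ~{seconds:.1f} s per user action")

    # app at the branch, server 80 ms away: ~8.0 s per user action
    # app on Citrix, server 3 ms away: ~0.3 s per user action
    # The user still pays the 80 ms WAN latency, but only for screen updates,
    # not once per middleware/database round trip.
    ```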

    • Good points, thanks. But please tell me you didn't have 100k people using that VB6 app with the middleware!

I’d love to read more about these projects. In particular, were they rewrites, or “rehosting”? What domain and what was the avg transaction count? Real-time or batch?

  • Citrix is almost always used to re-host existing applications. I've only ever seen very small utility apps that were purpose designed for Citrix and always as a part of a larger solution that mostly revolved around existing applications.

    Note that Citrix or any similar Windows "terminal services" or "virtual desktop" product fills the same niche as ordinary web applications, except that Win32 GUI apps are supported instead of requiring a rewrite to HTML. The entire point is that existing apps can be hosted with the same kind of properties as a web app, minus the rewrite.