
Comment by AnthonyMouse

4 years ago

They're also running up against significant competition in the form of extremely inexpensive local hardware.

Ten or twenty years ago a company with a thousand employees typically needed multiple racks full of servers, in some cases to handle the load but in many cases just because each separate service would have its own physical machine.

Today all of that can fit on a pair of local physical machines hosting virtual guests. A single machine can have over a hundred cores and terabytes of memory. Moreover, a physical machine can be amortized over ten years provided you have a load that doesn't vary significantly over time. And because it's such a small number of physical machines, you no longer need exotic local power and cooling solutions.

Cloud providers are already more expensive than this in many cases. They need every cost advantage they can get just to be in the game.
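The back-of-envelope math behind "already more expensive in many cases" can be sketched in a few lines. Every figure below is a hypothetical placeholder, not real vendor or AWS pricing:

```python
# Hypothetical comparison: amortized on-prem hardware vs. always-on cloud VMs.
# All numbers are illustrative placeholders, not quotes from any vendor.

HARDWARE_COST = 60_000          # two large servers bought outright, one-time
AMORTIZATION_YEARS = 10         # the ten-year amortization assumed above
POWER_COOLING_PER_YEAR = 6_000  # modest, since it's only two machines

CLOUD_VM_PER_HOUR = 5.00        # one comparably sized cloud instance
HOURS_PER_YEAR = 24 * 365

on_prem_per_year = HARDWARE_COST / AMORTIZATION_YEARS + POWER_COOLING_PER_YEAR
cloud_per_year = 2 * CLOUD_VM_PER_HOUR * HOURS_PER_YEAR  # two instances, 24/7

print(f"on-prem: ${on_prem_per_year:,.0f}/yr, cloud: ${cloud_per_year:,.0f}/yr")
```

The point isn't the exact numbers; it's that a steady 24/7 load spread over a long amortization period flips the comparison in favor of owned hardware.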

Hardware is cheap. People are expensive. Besides that, procuring resources with your cloud provider is simply a matter of writing a YAML file. Not to mention the lack of an upfront investment, and paying only for the resources you need instead of spending money on enough hardware to handle peak load. You would be amazed at the amount of resources you can buy for the cost of the fully allocated salary of one engineer.

And yes, you can buy hardware. But can you run a data center in multiple regions? Besides that, any cloud provider offers more than just a bunch of VMs. AWS alone has 260 services, with entire teams of people keeping them patched and optimized. I don’t keep up with Azure as carefully, but this isn’t meant to be an Azure vs. AWS comment. I just don’t know Azure.
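To make the "writing a YAML file" point concrete, a minimal CloudFormation template really can stand up a VM. The instance type and AMI ID below are placeholders, not recommendations:

```yaml
# Sketch only: provisions a single EC2 instance from one YAML file.
AWSTemplateFormatVersion: "2010-09-09"
Resources:
  AppServer:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: m5.xlarge          # placeholder size
      ImageId: ami-0123456789abcdef0   # placeholder AMI ID
```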

  • > Hardware is cheap. People are expensive.

    Except that you still need the people, because most of the labor isn't putting the hardware in the rack, it's managing the software which you have to do regardless of where the hardware is.

    > Besides that, procuring resources with your cloud provider is simply a matter of writing a YAML file.

    That is no different than it is locally.

    > Not to mention the lack of an upfront investment, and paying only for the resources you need instead of spending money on enough hardware to handle peak load.

    But hardware is cheap, remember? And most companies don't actually have large load variations.

    > But can you run a data center in multiple regions?

    Obviously yes. Any company of non-trivial size would have multiple sites and could locate a host at more than one. This doesn't even necessarily raise the price: you already need enough machines to provide redundancy, so locating them at different sites requires no additional hardware, only placing some of the existing hardware at other sites.

    This is also mostly overrated for companies smaller than that, because cloud providers have had company-wide outages at a frequency not all that much higher than that of site-wide outages at sites with a reasonable level of redundancy.

    > Besides that, any cloud provider offers more than just a bunch of VMs. AWS alone has 260 services, with entire teams of people keeping them patched and optimized.

    This is only relevant if you're using 260 different services and not just a bunch of VMs, and plenty of companies are using just a bunch of VMs.

    • As a manager of a team of application product developers, I can tell you that the headcount cost of ops teams, and the time cost of people whose jobs shouldn't involve VM provisioning overhead but nonetheless do, are both huge compared to cloud services. With cloud tooling, my team of people, all with zero experience provisioning VMs, can get production systems up, add logging, add alerting, add networking, etc., all very easily or with only low-touch overhead from the teams that manage best practices and compliance. Creating the same internal developer tool experience with data centers is SO expensive and requires a major headcount investment.


    • For context: my first exposure to the cloud was at my last company, which had 100 employees. We aggregated publicly available (i.e., no PII) health care provider data from all 50 states and government agencies, as well as various disease/health dictionaries, and we combined it with data sent to us from large health systems.

      These are the services we used.

      Infrastructure

      - Route 53 (DNS)

      - SQS/SNS (messaging)

      - Active Directory

      - Cognito (SAML/SSO for our customers)

      - Parameter Store/DynamoDB (configuration)

      - CloudWatch (logging, monitoring, alerts, scheduling)

      - Step Functions (orchestration)

      - Kinesis (stream processing). We were just introducing this when I left. I’m not sure what they were using it for.

      CI/CD

      We used GitHub for source control.

      - CodePipeline (CI/CD orchestration)

      - CodeBuild (Serverless builds. It would spin up a Windows or Linux Docker container and basically run PowerShell or Bash commands)

      - Self-hosted Octopus Deploy server

      Data Storage

      - S3 (Object/File storage)

      - Redshift (OLAP database)

      - Aurora/MySQL (OLTP RDBMS). When we had large indexing runs into ElasticSearch, read replicas would autoscale.

      - ElasticSearch

      - Redis

      Data Processing

      - Athena (Serverless Apache Presto processing against S3)

      - Glue (Serverless PySpark environment)

      Compute

      - EC2 (Pet VMs and one autoscaling group of VMs to process data as it came in from clients. It ran a legacy Windows process)

      - ECS/Fargate (Serverless Docker cluster)

      - Lambda (for processes where we needed to scale from 0 to $alot for backend processes)

      - Workspaces (Windows VMs hosted in the US as Dev machines for our Indian Developers who didn’t want to deal with the latency.)

      - Layer 7 load balancers

      Front end

      - S3 (hosted static assets like html, JS, CSS. You can serve S3 content as a website.)

      - CloudFront (CDN)

      - WAF (Web Application Firewall)

      All of the above infrastructure was duplicated for five different environments (DEV, QAT, UAT, Stage, Prod). In Prod, where needed, infrastructure was duplicated across multiple availability zones (not regions).

      Where applicable, backups were automated.

      We had two full-time operations people. The rest was maintained by developers.

      As far as the rest:

      > [Procuring resources] is no different than it is locally.

      I can go from no infrastructure to everything I just named in a matter of hours locally? I can set up a multi-availability-zone MySQL database with automated backups just by running a YAML file locally, and then turn it off when not needed?
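      As a sketch of what that YAML file might contain (a CloudFormation fragment; all values here are illustrative placeholders, not our actual configuration):

      ```yaml
      # Sketch: a multi-AZ MySQL instance with 7 days of automated backups.
      Resources:
        AppDatabase:
          Type: AWS::RDS::DBInstance
          Properties:
            Engine: mysql
            DBInstanceClass: db.r5.large   # placeholder size
            AllocatedStorage: "100"
            MultiAZ: true                  # standby replica in a second AZ
            BackupRetentionPeriod: 7       # days of automated backups
      ```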
