gse-cloud V2

An experiment in cloud sovereignty: running my own hosting infrastructure on Linux servers. The initial goal is to self-host email with webmail and websites. If it works, I can potentially offer the same to others to help cover the cost of running the infrastructure. Ultimately, I would like to expand the platform to host capabilities similar to those offered by Google Workspace or Office 365, at least for myself (I am excited about this). I am unlikely to move into hosting more cloud-type services such as Virtual Private Servers (which was one of the goals of GSECloud V1).

linux hestiacp self-hosted email prototype

what went wrong with v1

I am now working on version 2 of gse-cloud. Version 1 didn't work out for a number of reasons. In summary, I tried to make everything perfect, aiming at high availability, maximum security, and complete automation from bare metal up. All of this turned out to be extremely difficult to achieve. The challenges were numerous and ultimately insurmountable.

Automation

I placed too much emphasis on automation. There is definitely a limit to what is practical and returns a meaningful benefit.

I overdid it and it cost time that I should have dedicated elsewhere.

learning curve

It's amazing how much deep knowledge is required to build and manage infrastructure. Some things have a learning curve that was just too steep for me. HA firewall configuration, for example, is a job for a specialist with years of experience. I never got it working properly. Storage arrays, networking, bare metal provisioning, security hardening and monitoring are all examples of specialised skills that are required but very hard to learn.

data centre requirements

True high availability requires data centre infrastructure: redundant internet connections, redundant power, and separate geographical locations. None of that is realistic for a small business or individual.

storage cost and complexity

If one wants to be serious about data availability, the "3-2-1" backup strategy is the gold standard. It calls for three copies of all data, on two different types of media, with one copy preferably kept off-site rather than co-located. In practice it means that for every GB of storage one uses, one needs three times that in disk capacity. It gets worse if the underlying disks are RAIDed, as RAID sacrifices some top-line capacity to redundancy. To have lots of storage, it is therefore not enough to go out and buy a single massive disk with tons of space on it. If you did that, you would need at least three of them.
Managing three tiers of storage seamlessly is extremely complex.
Large numbers of disks need storage arrays to host them, alongside expensive fast networks. Storage arrays can cost as much as servers, even before they're populated with disks.
HDDs fail surprisingly regularly and need to be replaced.
Small form factor server disks are more expensive than regular 3.5" consumer disks.
Perhaps SSDs change the maths a little with reduced failure rates, but they do have their own issues and trade-offs, e.g. they are more expensive per GB and they have a shorter functional lifespan.
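
The capacity arithmetic above can be sketched roughly as follows. This is an illustration only: the function names are made up for this example, and it assumes every copy sits on the same RAID layout, which will not be true of every setup (an off-site copy is often not RAIDed at all).

```python
def raid_efficiency(level: str, disks: int) -> float:
    """Fraction of raw capacity that stays usable after RAID redundancy."""
    if level == "none":
        return 1.0
    if level == "raid1":
        if disks < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        # Mirroring: every disk holds the same data.
        return 1 / disks
    if level == "raid5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        # One disk's worth of capacity is spent on parity.
        return (disks - 1) / disks
    if level == "raid6":
        if disks < 4:
            raise ValueError("RAID 6 needs at least 4 disks")
        # Two disks' worth of capacity is spent on parity.
        return (disks - 2) / disks
    raise ValueError(f"unknown RAID level: {level}")


def raw_capacity_tb(data_tb: float, copies: int = 3,
                    level: str = "none", disks: int = 1) -> float:
    """Raw disk capacity needed to hold `copies` copies of `data_tb`."""
    return data_tb * copies / raid_efficiency(level, disks)


# 2 TB of data, three copies, no RAID.
print(raw_capacity_tb(2))
# The same data with every copy on a two-disk mirror.
print(raw_capacity_tb(2, level="raid1", disks=2))
```

With mirroring on every copy, the raw capacity is six times the data actually stored, which is where the cost starts to bite.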

Hardware cost

Even second-hand, refurbished servers are expensive and need refreshing eventually.
Expired warranties can be an issue, especially for ongoing support with iLO software, which is essential for bare metal automation.
Server vendors require that subscriptions are paid for supported use of iLO.
Networking equipment is expensive.
- Redundant networking requires multiple managed switches for each route. The number of devices required climbs fast as you add more compute nodes, as does the complexity.
- The expense of advanced features like fibre optic networking is basically out of reach for me.

power

The more powerful the equipment, the more electricity it consumes. Running costs per server per month can be as much as £35, at which point the cost/benefit calculation, on running costs alone, starts to lean heavily towards public cloud. That's without even considering the already skyrocketing capital investment needed for the equipment and the time burden of managing everything.
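
A back-of-envelope version of that running-cost sum looks like this. The 170 W average draw and the £0.28 per kWh tariff are illustrative assumptions picked to land near the figure above; they are not measurements from my own servers.

```python
HOURS_PER_MONTH = 24 * 30  # approximate a month as 30 days


def monthly_power_cost(avg_watts: float, price_per_kwh: float) -> float:
    """Electricity cost of running one device continuously for a month."""
    kwh = avg_watts / 1000 * HOURS_PER_MONTH
    return kwh * price_per_kwh


# Assumed figures: 170 W average draw, £0.28 per kWh.
cost = monthly_power_cost(170, 0.28)
print(f"£{cost:.2f} per server per month")
```

Multiply that by the number of servers, switches and storage arrays in a redundant setup and the monthly bill grows quickly.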

security

Security is a bottomless pit and doing it well requires:

Ongoing active scanning
Expensive tooling
Independent pen testing
Physical security
Compute and network partitioning
Reliable DR
Robust processes

The list goes on and on. Even when you're doing it well, it's never enough to be 100% certain about anything.

the v2 approach

philosophy

outcomes over perfection

For V2, the focus shifts from chasing perfection to delivering working, useful functionality, and being honest about the trade-offs involved so users can make up their own minds.

AI fills the gaps

The world has changed a lot since V1. AI now helps fill the skills gaps that previously held me back. V2 fully embraces delivery with the help of AI.

Hybrid Cloud Strategy

use the right tool

Where public cloud providers offer cost-effective services that I can't realistically replicate without significant cost, I'll use them rather than try to reinvent the wheel.

where cloud helps

Secure key management, affordable storage for backups and archiving, edge compute with IPv6 support, and automatically provisionable DR are all good candidates.

EU Preference

Given the political instability in the United States at the moment, EU cloud providers are preferred over US-based ones, where they have what is required.

availability & DR

frequent backups

Daily backups are the first line of defence against data loss — simple, effective, and proportionate.
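
As a minimal sketch of what a daily backup job could look like, the function below writes a date-stamped compressed archive of a directory. The function name and paths are hypothetical, and this is not necessarily how backups are done on this platform; a real job would also need scheduling, off-site copying and encryption.

```python
import tarfile
from datetime import datetime, timezone
from pathlib import Path


def create_backup(source: Path, dest_dir: Path, now: datetime = None) -> Path:
    """Write a date-stamped .tar.gz archive of `source` into `dest_dir`."""
    now = now or datetime.now(timezone.utc)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # One archive per day; re-running on the same day overwrites it.
    archive = dest_dir / f"{source.name}-{now:%Y-%m-%d}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the source folder.
        tar.add(source, arcname=source.name)
    return archive
```

Run once a day from a scheduler such as cron, for example `create_backup(Path("/srv/site"), Path("/backups"))`, where both paths are placeholders.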

realistic RTO & RPO

A simple, fit-for-purpose DR strategy guards against expected equipment and infrastructure failures. Setting achievable recovery time objectives (RTO) and recovery point objectives (RPO) keeps expectations clear. Point-in-time recovery is not achievable, or even necessarily required, in most cases.
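
To make the RPO idea concrete, here is a small sketch of the arithmetic. It assumes a backup captures the state of the data at the moment it starts and only becomes usable once it finishes; the function names are invented for this example.

```python
from datetime import datetime, timedelta


def worst_case_data_loss(backup_interval: timedelta,
                         backup_duration: timedelta) -> timedelta:
    """Upper bound on lost data if a failure hits just before the next
    backup finishes, so the last usable copy is a full interval old."""
    return backup_interval + backup_duration


def rpo_is_met(last_backup: datetime, now: datetime, rpo: timedelta) -> bool:
    """True if the newest usable backup is recent enough for the RPO."""
    return now - last_backup <= rpo


# Daily backups that take an hour to complete.
print(worst_case_data_loss(timedelta(hours=24), timedelta(hours=1)))
```

Under those assumptions, daily backups support an RPO of a little over a day, which is the kind of expectation that should be stated up front.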

scope

what this is for

Personal sites, email, and hobbyist hosting. Resources that aren't valuable or time-critical — where a brief outage is an inconvenience, not a crisis.

what this isn't for

Business-critical services and sensitive commercial or personal data don't belong here. The Ts&Cs will say so explicitly, and I won't be accepting liability for those use cases.

security

encrypt everything

Expect successful attacks. All data at rest and in transit must be encrypted as a baseline, so that even if attackers do get their hands on it, it is useless to them.
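
For the in-transit half of that baseline, one possible sketch is a client-side TLS context that refuses unverified certificates and old protocol versions. The function name is hypothetical and the TLS 1.2 floor is my assumption, not a stated requirement of the platform. Encryption at rest is a separate concern and is not shown here.

```python
import ssl


def strict_client_context() -> ssl.SSLContext:
    """TLS client context that verifies certificates and hostnames."""
    # create_default_context() enables certificate and hostname checks.
    ctx = ssl.create_default_context()
    # Assumed policy: refuse anything older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

A context like this can be passed to standard library clients, for example `smtplib.SMTP_SSL(host, context=strict_client_context())`.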

zero trust

No implicit trust between services or components.
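
One way to illustrate the idea is to have every message between services carry a signature that the receiver checks, instead of trusting it because it came from the internal network. This is only a sketch with hypothetical names and a shared secret. Mutual TLS or short-lived tokens are equally valid approaches, and nothing here is confirmed as the design used by this project.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> str:
    """HMAC-SHA256 signature of a message, as a hex string."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, signature: str) -> bool:
    """Check a signature using a constant-time comparison."""
    return hmac.compare_digest(sign(key, message), signature)
```

The receiving service rejects any request where `verify` returns `False`, whichever host or network it arrived from.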

good basics

Strong passwords, regular updates, and only the ports that services need left open. Anything simple but effective.

reduce impact

Reduce the impact of risks rather than their likelihood by only hosting resources that aren't valuable, sensitive or time-critical. No guarantees of safety or availability.