Posts by SpinningOps Editor

Do You Overpay for Your Cloud Usage?

Knowing whether you're overpaying is essential for your budget. In this post we'll walk through how to review your cloud spend.

The steps to understand if you’re overpaying are:

  • Check your latest cloud billing report.
  • Compare your latest cloud billing report with the previous billing reports.
  • Understand what each item in the billing report is for.
  • Map the revenue produced by the items in the billing report.
  • Identify spikes in billing.

Check Your Latest Cloud Billing

Start by going to your cloud provider's billing page and downloading the latest billing report.

Then print it so you can review each item and note its description.

Compare Your Latest Cloud Billing Report With Previous Billing Reports

This will give you a general overview of how much your systems are costing your company and whether costs are increasing.

Summarize each previous month in the calendar year to get the full picture.

Understand What Each Item In The Billing Report Is

Take the billing report you printed earlier and write on each row what the item is for.

For example: “dev servers” are for developers working on project “new feature”.

Go over the billing report and make sure that every line is explained. Then focus on the three most expensive items and try to understand whether they can be reduced.

Extract The Revenue Produced By The Items In The Billing Report

This part is a bit tricky but possible.

Go over the revenue that your product is generating and try to connect it to the specific components in the billing report.

For example: let's say you have a SaaS that people use to chat with each other. The main components are a load balancer, the main app, and databases, all of which are necessary for users to use the application.

So those components are direct generators of the revenue.

Identify Spikes In Billing

If you check your billing report monthly, it's easier to identify a spike in billing, and since you know what each item does, you'll know which service and component the spike relates to and can fix it.
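
If your cloud provider is AWS, a small script can pull the monthly totals per service and flag month-over-month jumps for you. This is only a rough sketch using the Cost Explorer API via boto3; the date range and the 30% threshold are arbitrary assumptions, so adjust them to your own billing cycle.

```python
# Sketch: flag month-over-month cost spikes per AWS service (assumes boto3 and Cost Explorer access).
import boto3

ce = boto3.client("ce")  # Cost Explorer

response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2024-01-01", "End": "2024-07-01"},  # assumed date range
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
)

# Build {service: [cost per month, ...]} from the grouped results.
# Note: a service with no cost in a given month simply contributes no entry for that month.
costs = {}
for month in response["ResultsByTime"]:
    for group in month["Groups"]:
        service = group["Keys"][0]
        amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
        costs.setdefault(service, []).append(amount)

# Flag any service whose latest month is 30%+ above the previous month (arbitrary threshold).
for service, monthly in costs.items():
    if len(monthly) >= 2 and monthly[-2] > 0 and monthly[-1] > monthly[-2] * 1.3:
        print(f"Spike: {service} went from {monthly[-2]:.2f} to {monthly[-1]:.2f}")
```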

P.S.

Check OUR INSTAGRAM PAGE for a summary of this post.

Cloud Migration Is Easy

Cloud migration can be easy if you plan it and understand how the cloud works.

Cloud vs. On-premises

Both cloud and on-premises are just servers that act as compute resources, but they are very different in approach, maintenance, and information security.

For example, when your company has its own server farm (it can be a server room in the office or at a hosted facility), the amount of maintenance is extremely high compared to the cloud.

Some of the maintenance tasks can be:

  • Physical alarm system
  • Air conditioning monitoring
  • Operating network devices (switches, routers and firewalls)
  • Local backup and remote backup
  • Hardware maintenance (servers)
  • Operating a local cloud (hypervisors & VMs)
  • etc

For cloud it’s mostly:

  • Operating services
  • Optimizing billing
  • Cloud security
  • Automation
  • etc

The two are very different, and that is why companies struggle with migrating their compute infrastructure to the cloud.

Cloud approach vs. On-premises approach

The main reason companies struggle to do an easy migration to the cloud is approach: they think that what worked with on-premises will work in the cloud too.

The two main roles for those two approaches are cloud engineer and system administrator, and both come from different backgrounds (but that's for a separate blog post).

The on-premises and cloud mindsets correlate with whether the company operates a production environment: usually companies that run on-premises do not have a production environment (though some do), and most companies that use the cloud have a product in a production environment (though some don't).

So the difference in approach is production vs. non-production, and mainly how to deploy software fast and reliably to the compute resources in the cloud.

How to have a successful and easy migration?

Understand that the approach is very different between cloud and on-premises (that's why DevOps was invented).

Train or hire cloud engineers who understand the DevOps approach.

Get a cloud architect to lead the cloud migration project (verify the cloud architect's credentials and experience before hiring them!).

Understand that this kind of project is not cheap and will require a budget.

To learn more about cloud migration you can download the cloud-migration PDF HERE.

The SpinningOps team has had very successful cloud migration projects, but we only onboard clients that understand the DevOps approach. If you're interested in cloud migration, fill out this FORM.

Build and deploy Flask app to EKS

Here's an example of how to adopt the DevOps approach in your development process.

Code your application

In this example we'll use Flask to build a website, and you can apply this approach to any programming language.

This app is a simple Flask app (video link below) with one route that returns an index.html file.
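
For reference, here's what a minimal version of such a Flask app might look like (the route and template name are assumptions, not the exact code from the video):

```python
# app.py - a minimal Flask app with one route that serves templates/index.html
from flask import Flask, render_template

app = Flask(__name__)

@app.route("/")
def index():
    # Flask looks for index.html inside the "templates" folder by default.
    return render_template("index.html")

if __name__ == "__main__":
    # For local development only; in the container use a WSGI server such as gunicorn.
    app.run(host="0.0.0.0", port=5000)
```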

Build your application

Once the current version is ready you can build the project; the pipeline should include build and test steps.

If you're using a statically typed language, you'll need to compile the application before building the container image and then copy the artifact into the image.

For dynamically typed languages there is no compile step, so it's just building the container image using the CI pipeline.

Deploy your application

To keep a software product updated and maintained, the deployment process should be on-demand and easy (click a button).

This code > test > deploy cycle should be repeated for every code change.
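
As a rough illustration of a "click a button" deploy, the sketch below uses the official Kubernetes Python client to point an existing Deployment at a new image tag; the deployment name, namespace, and image are hypothetical placeholders for whatever your pipeline produces.

```python
# Sketch: roll a new container image out to an existing Kubernetes Deployment on EKS.
# Assumes the official "kubernetes" Python client and a kubeconfig that is already set up.
from kubernetes import client, config

def deploy(image: str, name: str = "flask-app", namespace: str = "default") -> None:
    config.load_kube_config()   # or config.load_incluster_config() when running inside the cluster
    apps = client.AppsV1Api()
    patch = {
        "spec": {
            "template": {
                "spec": {
                    "containers": [{"name": name, "image": image}]
                }
            }
        }
    }
    # Patching the pod template triggers a rolling update of the Deployment.
    apps.patch_namespaced_deployment(name=name, namespace=namespace, body=patch)

if __name__ == "__main__":
    deploy("123456789012.dkr.ecr.us-east-1.amazonaws.com/flask-app:v2")  # hypothetical ECR image
```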

What can be accomplished here?

By doing the work (coding) and implementing a deployment process, the gap between development and production is closed, and the outcome is faster software delivery (CD) with reliable code (tests).

So DevOps is an approach and a way to build software.

Check this video tutorial for an end-to-end build and deploy of a Flask website to a Kubernetes cluster (EKS).

Cloud Native

How does your team work with cloud infrastructure?

Cloud Native Topics

  • Development process
  • System design
  • Builds and packages
  • Deployments
  • Release
  • Cloud infrastructure

Development process

When working with cloud systems, the proven method to develop and run applications in production is DevOps.

DevOps is the practice of code > build > test > deploy > release > repeat.

DevOps is about bridging the gap between development and production.

System design

You can use different methods of designing applications, but if your applications run in the cloud then microservices might be the better fit; otherwise, why use an on-premises approach when you can take advantage of the benefits of cloud infrastructure?

Builds (CI) and packages

Containers are by now the default way to package applications, as they are easy to ship and deploy.

It's easy to start a local development environment using containers and work on your application.

CI is an integral part, as it takes your committed code and gets it built and deployed.

Deployments

Deployment to production is easier when the application is packaged as a container image.

This is the next step after the build: it confirms that the tests passed and that the container image with the latest code has already been pushed to the image repository and is ready for deployment to production.

Release

This is the step after the deployment succeeds and the new image is running in production.

Now the choice is when to enable new features using a feature toggle.

Once the new features are enabled, the release step is complete.
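
For instance, a feature toggle can be as small as an environment variable checked at request time, so code that is already deployed stays dark until you flip it. A minimal sketch (the flag name and route are assumptions; in Kubernetes the flag would typically come from a ConfigMap):

```python
# Sketch: enable a new feature at release time via an environment variable toggle.
import os
from flask import Flask

app = Flask(__name__)

def feature_enabled(flag: str) -> bool:
    # The flag is read at request time, so flipping the env var changes behaviour
    # without rebuilding or redeploying the image.
    return os.environ.get(flag, "false").lower() == "true"

@app.route("/feed")
def feed():
    if feature_enabled("NEW_FEED_ENABLED"):
        return "new feed layout"   # code deployed earlier but kept dark until release
    return "old feed layout"
```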

Cloud Infrastructure

To manage the containers and successfully implement microservices, you'll need a Kubernetes cluster to orchestrate the container runtime.

Other services, like email sending, databases, load balancers and more, can be integrated with your Kubernetes cluster and used across the entire stack.

Summary

Cloud native is proven to deliver better results and happier developers.

But hey, you can always start a long-running VM and install some stuff on it.

How to deploy production systems

In some situations development teams deploy some of their production systems from local, yes, from their local laptops. That is not a recommended practice, as it causes issues in prod.

Why do developers modify production from their local laptops?

There can be a few reasons why deploying new code and modifying production config is done from a local laptop; here are a few examples:

  • No CI pipelines
  • Failed CI pipeline
  • Just bad practice

In any case, it is not recommended to modify production systems from a local laptop; it's better to use CI tools.

How to modify production systems?

When you update production with new code, config, or new services, the downtime should be zero.

For a successful deployment to production you'll need to adopt a few approaches.

GitOps

In summary, this means that every modification is committed to your Git repository.

Infrastructure as code

In summary, this means that the creation and updating of the infrastructure should be declared in code/templates.
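
As a tiny illustration of declaring infrastructure in a template, the sketch below creates an S3 bucket through CloudFormation using boto3; the stack and bucket names are made-up examples, and in practice you'd keep the template itself in the Git repository so GitOps and IaC work together.

```python
# Sketch: infrastructure declared as a template and applied through CloudFormation
# (assumes boto3 and AWS credentials; names are hypothetical).
import json
import boto3

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "AppAssetsBucket": {
            "Type": "AWS::S3::Bucket",
            "Properties": {"BucketName": "spinningops-example-assets"},
        }
    },
}

cfn = boto3.client("cloudformation")
cfn.create_stack(
    StackName="app-assets",            # hypothetical stack name
    TemplateBody=json.dumps(template), # the template, ideally versioned in Git
)
```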

CI Pipeline

When a commit is made to the relevant repository, Git sends a webhook to the CI system to start the CI pipeline, which initiates: build > test > deploy.

When the CI pipeline completes, we know that the build, tests, and deploy succeeded as expected (the DevOps outcome should be predictable for every pipeline).

Note: there's another step, release, that enables the actual new code, and this is done via a feature toggle.

Deployment to production

This is where the deployment strategy comes in; obviously, a force-deploy will shut down the services and start them again (downtime).

We'll need a better approach, like blue-green, that creates another group of resources with the new code; only after it is active will the current requests/traffic be redirected to the new group of services.

After the deploy is OK, the old resources can be deleted.
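
On Kubernetes, one way to sketch blue-green is two Deployments labelled by colour and a Service whose selector is flipped once the new colour reports ready. This is only a sketch under assumed names and labels (myapp-blue/myapp-green, a "color" label), not a complete rollout tool.

```python
# Sketch: blue-green cutover on Kubernetes - wait for the "green" Deployment, then flip the Service selector.
# Assumes the official "kubernetes" Python client and hypothetical Deployments myapp-blue / myapp-green.
import time
from kubernetes import client, config

config.load_kube_config()
apps = client.AppsV1Api()
core = client.CoreV1Api()

NAMESPACE = "default"

# 1. Wait until the green Deployment has all replicas ready.
while True:
    green = apps.read_namespaced_deployment("myapp-green", NAMESPACE)
    if green.status.ready_replicas == green.spec.replicas:
        break
    time.sleep(5)

# 2. Redirect traffic: point the Service selector at the green pods.
core.patch_namespaced_service(
    name="myapp",
    namespace=NAMESPACE,
    body={"spec": {"selector": {"app": "myapp", "color": "green"}}},
)

# 3. Only after traffic is confirmed healthy should the blue Deployment be scaled down or deleted.
```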

Debugging

To verify that the deploy is OK:

  • Check your metrics and logs
  • Check that the service is operational (can be done via automated QA)

Summary

Do not be tempted to deploy from a local laptop, as this will cause issues and will not be registered in logs or as a commit.

Use a CI pipeline!

DR for production systems

If you clicked on this blog post, you probably need to implement DR in your production systems or have been asked about it.

Let’s take a look at the options for DR in production.

Non-prod vs. Prod

In order to provide a DR solution, we'll need to discuss the current status of the systems, and to be exact, whether the product is deployed in production or shipped to clients as a downloadable software package.

Non-prod basically means a company that develops software that is sent to clients and installed on-premises at the client's data center.

It can be self-managed with a support SLA, but the client is responsible for its own infrastructure and operations.

Production means that the software is deployed to a system that actively handles requests from clients and must be online in order for users to use it.

The two scenarios are very different: in non-prod there's no DevOps, since the product is not in production, but if the product is in production then DevOps is needed in order to bridge the gap between development and production systems.

Passive DR

Passive DR is the option of having duplicate infrastructure and systems to be activated when needed.

The goal is to have minimal downtime and a second location for the data that is available when needed.

This approach is suitable for non-prod, since there is no real need for active DR when the product is not in production.

Also, when switching passive DR to active, it takes time to check and test that every system is active and that the data is indeed replicated as expected, so it's a slow process.

Active DR

Active DR is the option of having duplicate infrastructure and systems running concurrently with the main production systems.

Basically, it's two sets of production running simultaneously; this is very suitable for production, since the DR is active and already working in production.

  • Active DR can take the new load immediately and respond to it via auto-scaling.
  • Production load can be shared between the main site and the DR.
  • Active DR is in use, meaning the costs are already paying off, in contrast to passive DR, which is paid for but not in use.

Summary

If your company needs a DR solution, you have two options to pick from. Review your stack and system design, consider both options, and decide which is best for your product.

Continuous Integration In Production

What is CI, and how do you implement it in production?

CI – Continuous Integration

CI is the practice of running automated pipelines for the testing and building processes.

The automated build includes the following:

  • Testing to verify code reliability and the desired result from the developed feature or bug fix.
  • Compiling the source code into an artifact.
  • Building the source code into a Docker image.
  • Additional tests with other software components may also be included in the pipeline.
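
A CI pipeline is usually defined in the CI tool's own configuration, but the core steps boil down to something like the sketch below: run the tests, and only if they pass, build and push the image. The image name and registry are hypothetical placeholders; treat this as a sketch of the ordering, not a drop-in pipeline.

```python
# Sketch: the test -> build -> push ordering of a CI pipeline, expressed as a plain script.
# Assumes pytest and the docker CLI are available on the CI runner; image/registry names are hypothetical.
import subprocess
import sys

IMAGE = "registry.example.com/myapp:latest"

def run(cmd: list) -> None:
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)   # a non-zero exit raises CalledProcessError and fails the pipeline

try:
    run(["pytest", "-q"])                       # 1. tests must pass first
    run(["docker", "build", "-t", IMAGE, "."])  # 2. build the container image
    run(["docker", "push", IMAGE])              # 3. push it to the image repository
except subprocess.CalledProcessError as err:
    # This output is the CI feedback the developer reads to see which step failed.
    sys.exit(f"Pipeline failed at: {err.cmd}")
```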

CI Feedback

When a developer commits code that triggers a CI pipeline, feedback is generated.

Based on that feedback (the pipeline log output), the developer knows whether there are issues or whether the pipeline succeeded and is ready for deployment to production (Continuous Deployment).

The log includes the various steps that ensure the code is reliable and the software is doing what it's designed to do.

CI Before CD

To ensure a successful deployment to production, CI pipelines are extremely important, as they are the verification process for the development stage.

Integrating CI pipelines across all of your applications will ensure an expected result for every new piece of code in development.

SpinningOps helps startups improve their system design. Contact us HERE and ask what we can do for your application.

Kafka In Production

Kafka is the main component of any system that implements an event-driven architecture.

What do you need to know before deploying Kafka in production?

Kafka is a low-latency component that acts as the broker between producers and consumers.
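
To make the producer/broker/consumer roles concrete, here is a minimal sketch using the kafka-python library; the broker address, topic name, and consumer group are assumptions for illustration.

```python
# Sketch: a producer writing to a Kafka topic and a consumer reading from it (kafka-python library).
from kafka import KafkaProducer, KafkaConsumer

BROKERS = "kafka:9092"   # hypothetical broker address
TOPIC = "user-events"    # hypothetical topic

# Producer side: services publish events to the broker.
producer = KafkaProducer(bootstrap_servers=BROKERS)
producer.send(TOPIC, b'{"user_id": 42, "action": "signup"}')
producer.flush()  # block until the message is actually written to the broker

# Consumer side: another service reads the same events, typically in its own process.
consumer = KafkaConsumer(
    TOPIC,
    bootstrap_servers=BROKERS,
    group_id="analytics",          # consumers in the same group share the topic's partitions
    auto_offset_reset="earliest",
)
for message in consumer:
    print(message.value)
```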

Low latency is extremely important, and you'll need to verify and monitor the following aspects of the cluster:

Low-latency disk IO

  • Kafka writes messages to disk (from producers), and messages are read from disk (by consumers).
  • Kafka uses sequential disk access, which is very fast.
  • Zero copy – this means Kafka copies data from the local disk directly to the network interface.
  • Disk size per Kafka broker should probably be 6TB or more, depending on the use case.

RAM

  • RAM is extremely important, as the Kafka process uses the Java heap.
  • Page cache – the main disk cache; the Linux kernel uses the page cache as a buffer for reading from and writing to disk.
    • If memory is available, the page is kept in the cache without accessing the disk.
    • This is extremely efficient and is what makes Kafka so fast.

Network

  • High throughput, as Kafka brokers all the data between services.

What do you need to know after deploying Kafka in production?

Tuning and reassigning partitions

Even after your cluster is working as expected in production, you'll need to tune it with new config options and reassign partitions.

Monitor

Monitoring your cluster is extremely important, as it reflects the actual status of the cluster, and since the cluster is the main component of an event-driven system, it should perform in microseconds and milliseconds.

Monitoring will make your debugging much easier, since Kafka metrics display the current status.

SpinningOps helps startups improve their system design. Contact us HERE and ask what we can do for your application.

Benefits of Databases in Microservices

If you are using the microservices approach in your stack, you might want to take it a step further and add a dedicated database for each service.

In this blog post we’ll discuss the benefits of having databases in Microservices.

Dedicated database per service

When every service has its own database, it actually simplifies the process, since every application is based on CRUD.

And every service can be built with the specific requirements it needs.

Mix database types

One of the common questions when building software is how to store the data: what kind of database should we use? SQL? NoSQL?

Choosing one type of database over another can limit the application stack, so why not use all types of databases?

Let's assume that the login service stores users, passwords, and emails. It does not require too much efficiency and speed, since login is done once per user as long as the session is open and the user has not logged out.

In this case we can choose the easiest and fastest login implementation.

What about the feed service? Let's say your application has a feed of data for its users; this should be low latency and very fast, so you'll probably want to use a key-value store database.

You get the gist: every service now has its own mini stack.
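
As a toy sketch of that idea, the login service below keeps its users in SQLite (a relational store) while the feed service keeps its items in Redis (a key-value store); the table, key names, and hosts are assumptions for illustration only.

```python
# Sketch: each service owns its own database, chosen for its own access pattern.
import sqlite3
import redis  # assumes the redis-py client and a Redis instance dedicated to the feed service

# --- Login service: relational store for users/credentials ---
login_db = sqlite3.connect("login-service.db")   # this file belongs to the login service only
login_db.execute("CREATE TABLE IF NOT EXISTS users (email TEXT PRIMARY KEY, password_hash TEXT)")
login_db.execute("INSERT OR REPLACE INTO users VALUES (?, ?)", ("a@example.com", "hash123"))
login_db.commit()

# --- Feed service: key-value store for fast, low-latency reads ---
feed_db = redis.Redis(host="feed-db", port=6379)  # hypothetical host reachable only by the feed service
feed_db.lpush("feed:user:42", "post-1001")        # newest items at the head of the list
latest = feed_db.lrange("feed:user:42", 0, 9)     # read the 10 most recent items
```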

Enhanced database security

Once your services use their own databases, it means that only that specific service accesses its database.

Meaning access is scoped per service.

Unlike one big database that all services connect to and, let's be honest, probably use the same credentials for.

So you can add rules so that only a specific service can access its database and not other services; also, only that service has the CRUD credentials for that specific database.

Reduce database load

If every service connects to its own database, the R/W operations are faster due to fewer database connections, unlike one big database that all services connect to.

That reduces hardware requirements as well, since the database hardware can be sized for the load of each service.

Clean database

What I mean by a clean database is that every database has unused or deprecated data that needs to be cleaned up.

By using a dedicated database for each service, that cleanup is easier, since you know the data belongs only to its parent service, and any modifications to the application can easily be applied to its database too.

Let's assume you decide that a specific service is deprecated; you'll probably want to delete its data.

How do you do that if you work with one big database? But if that service only uses its dedicated database, you simply deprecate the service together with its database.

Backup is easier

When you have one big database for all services, you need to back up the entire database, regardless of usage or unused data.

What about restore? The same applies here: you'll need to restore the entire database and not just the specific data that might be affected.

Now let's assume that every service has its own database and you need to schedule a backup or restore; then you only need to do it per service, and not for the entire database with all services' data.

SpinningOps helps startups improve their system design. Contact us HERE and ask what we can do for your application.