Tools

Development and programming tools are used to build frameworks, and they can be used for creating, debugging, and maintaining programs — and much more. The resources in this Zone cover topics such as compilers, database management systems, code editors, and other software tools and can help ensure engineers are writing clean code.

Latest Refcards and Trend Reports
Trend Report: Kubernetes in the Enterprise
Refcard #366: Advanced Jenkins
Refcard #378: Apache Kafka Patterns and Anti-Patterns

DZone's Featured Tools Resources

Fargate vs. Lambda: The Battle of the Future
By William Talluri
Fargate vs. Lambda has recently been a trending topic in the serverless space. Fargate and Lambda are two popular serverless computing options available within the AWS ecosystem. While both tools offer serverless computing, they differ regarding use cases, operational boundaries, runtime resource allocations, price, and performance. This blog aims to take a deeper look into the Fargate vs. Lambda battle. What Is AWS Fargate? AWS Fargate is a serverless computing engine offered by Amazon that enables you to efficiently manage containers without the hassles of provisioning servers and the underlying infrastructure. When cluster capacity management, infrastructure management, patching, and provisioning resource tasks are removed, you can finally focus on delivering faster and better quality applications. AWS Fargate works with Amazon Elastic Container Service (ECS) and Amazon Elastic Kubernetes Service (EKS), supporting a range of container use cases such as machine learning applications, microservices architecture apps, on-premise app migration to the cloud, and batch processing tasks. Without AWS Fargate Developers build container images Define EC2 instances and deploy them Provision memory and compute resources and manage them Create separate VMs to isolate applications Run and manage applications Run and manage the infrastructure Pay EC2 instances usage charges When AWS Fargate Is Implemented Developers build container images Define compute and memory resources Run and manage apps Pay compute resource usage charges In the Fargate vs. Lambda context, Fargate is the serverless compute option in AWS used when you already have containers for your application and simply want to orchestrate them easier and faster. It works with Elastic Kubernetes Service (EKS) as well as Elastic Container Service (ECS). EKS and ECS have two types of computing options: 1. EC2 type: With this option, you need to deal with the complexity of configuring Instances/Servers. This can be a challenge for inexperienced users. You must set up your EC2 instances and put containers inside the servers with some help from the ECS or EKS configurations. 2. Fargate type: This option allows you to reduce the server management burden while easily updating and increasing the configuration limits required to run Fargate. What Is Serverless? Before delving deep into the serverless computing battle of Lambda vs. Fargate or Fargate vs. Lambda, it’s important first to gain a basic understanding of the serverless concept. Serverless computing is a technology that enables developers to run applications without needing to provision server infrastructure. The cloud provider will provide the backend infrastructure on-demand and charge you according to a pay-as-you-go model. The term “serverless” might be misleading for some people. Indeed, it’s important to note that serverless technology doesn’t imply the absence of servers. Rather, the cloud provider will manage the server infrastructure with this technology, allowing developers to concentrate their efforts on an app’s front-end code and logic. Resources are spun when the code executes a function and terminates when the function stops. Billing is based on the duration of the execution time of the resources. Therefore, operational costs are optimized because you don’t pay for idle resources. With serverless technology, you can say goodbye to capacity planning, administrative burdens, and maintenance. Furthermore, you can enjoy high availability and disaster recovery at zero cost. 
Auto-scaling to zero is also available. Finally, resource utilization is maximized, and billing is granular, down to the millisecond.

What Is AWS Lambda?

AWS Lambda is an event-driven serverless computing service. Lambda runs predefined code in response to an event or action, enabling developers to perform serverless computing. The service was developed by Amazon and first released in 2014. It supports major programming languages such as C#, Python, Java, Ruby, Go, and Node.js, and it also supports custom runtimes. Some of the popular use cases of Lambda include updating a DynamoDB table, uploading data to S3 buckets, and running events in response to IoT sensor data. The pricing is based on usage, rounded to the nearest millisecond. Moreover, Lambda allows you to run Docker container images of up to 10 GB stored in ECR.

When you compare Fargate vs. Lambda, Fargate is for containerized applications running for days, weeks, or years, whereas Lambda is designed specifically to handle small portions of an application, such as a function. For instance, a function that clears the cache every 6 hours and runs for 30 seconds can be executed using Lambda.

A Typical AWS Lambda Architecture

AWS Lambda is a Function-as-a-Service (FaaS) offering that helps developers build event-driven apps: in the app's compute layer, Lambda responds to AWS events. The three core components of the Lambda architecture are:

1) Function: A piece of code written by developers to perform a task, together with the details of its runtime environment. The runtime environments are based on the Amazon Linux AMI and contain all required libraries and packages; capacity and maintenance are handled by AWS.
a. Code Package: The packaged code containing the binaries and assets required for the code to run. The maximum size is 250 MB unzipped, or 50 MB compressed.
b. Handler: The entry point of the invoked function, which runs a task based on parameters provided by event objects.
c. Event Object: A parameter provided to the handler that carries the data needed to perform the logic of an operation.
d. Context Object: Facilitates interaction between the function code and the execution environment. The data available in the context object includes:
i. The AWS request ID
ii. The remaining time before the function times out
iii. Logging statements sent to CloudWatch
2) Configuration: Rules that specify how a function is executed.
a. IAM Roles: Assign permissions for functions to interact with AWS services.
b. Network Configuration: Specifies whether functions run inside or outside a VPC.
c. Version: Lets you publish and revert to previous versions of a function.
d. Memory Dial: Controls the resources allocated to a function.
e. Environment Variables: Values injected into the code at runtime.
f. Timeout: The maximum time a function is allowed to run.
3) Event Source: The event that triggers the function.
a. Push Model: Functions are triggered by sources such as S3 objects, API Gateway, and Amazon Alexa.
b. Pull Model: Lambda pulls events from sources such as DynamoDB or Kinesis.
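As a minimal illustration of the function, handler, event object, and context object described above, here is a sketch of a Python Lambda handler. The event fields are hypothetical; only the context attributes shown are part of the Lambda runtime.

Python

import json

def lambda_handler(event, context):
    # The event object carries the trigger's payload (an S3 notification,
    # an API Gateway request, an IoT message, and so on).
    name = event.get("name", "world")

    # The context object exposes runtime metadata such as the AWS request ID
    # and the remaining time before the configured timeout is reached.
    print(f"Request {context.aws_request_id}: "
          f"{context.get_remaining_time_in_millis()} ms left before timeout")

    # Return value for a synchronous (e.g., API Gateway) invocation.
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}"})
    }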
A Typical AWS Fargate Architecture

The four core components of the AWS Fargate architecture are:

1) Task Definition: A JSON file that describes the definitions of one or more of the application's containers.
2) Task: An instantiation of a task definition at the cluster level.
3) Cluster: A logical grouping of tasks or services in Amazon ECS.
4) Service: A process that runs tasks in an Amazon ECS cluster based on task definitions.

Fargate vs. Lambda: Performance

As far as performance is concerned in the AWS Fargate vs. Lambda debate, AWS Fargate is the winner, as it runs on dedicated resources. Lambda has certain limitations when it comes to allocating compute and memory resources. Based on the selected amount of RAM, AWS allocates the corresponding CPU resources, meaning that the user cannot customize CPU resources. Moreover, the maximum available memory for Lambda functions is 10 GB, whereas Fargate allows for 120 GB of memory, and Fargate lets you choose up to 16 vCPUs. Another notable limitation is that a Lambda function can only run for 15 minutes per invocation. The Fargate environment, by contrast, has no runtime limit and is always in a warm state.

Fargate workloads must be packaged into containers, which increases the startup time to around 60 seconds. This is a very long time compared to Lambda functions, which can start within about 5 seconds. Fargate allows you to launch 20 tasks per second using the ECS RunTask API, and you can launch 500 tasks per service in 120 seconds with the ECS service scheduler. That said, scaling the environment during unexpected request spikes and health monitoring tends to add some delay to start-up time.

Lambda Cold Starts

When Lambda receives a request to execute a task, it starts by downloading the code from S3 buckets and creating an execution environment based on the predefined memory and its corresponding compute resources. If there is any initialization code, Lambda runs it before the handler code. The time required for downloading the code and preparing the execution environment counts as the cold start duration. After executing the code, Lambda freezes the environment so that the same function can run quickly if it is invoked again. If you run the function concurrently, each concurrent invocation gets its own cold start, and there will also be a cold start whenever the code is updated. The typical cold start falls between 100 ms and 1 second. In light of the foregoing, Lambda falls short in the Lambda vs. Fargate race regarding cold starts; however, Provisioned Concurrency is a solution that reduces them. The runtime choice also has an impact on Lambda cold starts. For instance, the Java runtime needs more resources to start the JVM, which delays the start, whereas the C# or Node.js runtimes offer lower latencies.

Fargate Cold Starts

Fargate takes time to provision resources and start a task. Once the environment is up and running, containers get dedicated resources and run the code as defined.

Fargate vs. Lambda: Support

AWS Fargate works as an operational layer of a serverless computing architecture to manage Docker-based ECS or Kubernetes-based EKS environments. For ECS, you can define container tasks in JSON text files, and other runtime environments are supported as well. Fargate offers more capacity and deployment control than Lambda: Lambda is limited to 10 GB of ephemeral storage and 10 GB container images, and to a 250 MB (unzipped) package for code deployed via S3. Lambda supports all major programming languages, such as Python, Go, Ruby, PHP, C#, Node.js, and Java, and build tools such as Maven and Gradle. That said, Lambda only supports Linux-based container images. With Fargate, you can develop Docker container images locally using Docker Compose and run them in Fargate without worrying about compatibility issues. Since development and architecture are independent of Fargate, it outperforms Lambda in this particular category. When more control over the container environment is the key requirement, AWS Fargate is definitely the right choice.
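To make the task definition, cluster, and service concepts above concrete, here is a minimal sketch of launching a one-off task on Fargate with boto3. The cluster name, task definition, subnet, and security group IDs are placeholders for illustration, not values from this article.

Python

import boto3

ecs = boto3.client("ecs")

# Run a one-off task on Fargate from an existing task definition.
# Cluster, task definition, subnet, and security group values are placeholders.
response = ecs.run_task(
    cluster="my-cluster",
    launchType="FARGATE",
    taskDefinition="my-task-def:1",
    count=1,
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["subnet-0123456789abcdef0"],
            "securityGroups": ["sg-0123456789abcdef0"],
            "assignPublicIp": "ENABLED",
        }
    },
)

# A service would keep long-running tasks like this alive; here we simply
# print the ARN of the task that was started.
print(response["tasks"][0]["taskArn"])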
Fargate vs. Lambda: Costs

When comparing Fargate vs. Lambda costs, it is important to note that the two tools serve different purposes: Lambda is a Function-as-a-Service, while Fargate is a serverless computing tool for container-based workloads.

Lambda usage is billed in milliseconds. AWS Lambda charges $0.20 per 1 million requests plus $0.0000166667 per GB-second for the first 6 billion GB-seconds per month. The duration cost varies with the allocated memory: for instance, 128 MB of memory costs $0.0000000021 per ms, and 10 GB of memory costs $0.0000001667 per ms. As an example, consider a function with 10 GB of memory and 6 vCPUs that is always running with a concurrency of one. The monthly cost would be about $432.50. If the concurrency is two, the price doubles; if the environment runs half the day, the price is halved; and if it runs for only 10 minutes per day, the cost would be about $9.10 per month.

If you consider the same configuration in Fargate, the prices are noticeably lower. Fargate charges a flat rate of $0.04048 per vCPU per hour ($29.145 per month) and $0.004445 per GB of memory per hour ($3.20 per month). So, 10 GB of memory with 6 vCPUs running continuously for a month with a concurrency of one would cost $206.87. Moreover, Fargate separates CPU from memory, allowing you to choose a right-sized configuration, so you can save costs by reducing the vCPUs depending on your needs. When you consider a concurrency of 10, the difference grows even larger. Another advantage of Fargate is spot pricing, which offers an additional 30% savings.

Notice that Lambda costs are lower than Fargate costs when the idle time is greater. In light of the foregoing, we can conclude that Lambda is more suitable for workloads that are idle for long periods: Lambda is cost-effective if the resources are busy for a quarter of the time or less, and it is the best choice when you need to scale fast or isolate security from the app code. Contrastingly, Fargate suits cloud environments with minimal idle time. We think the best option is to implement Infrastructure as Code (IaC) and begin with Lambda; when workloads increase, you can seamlessly switch to Fargate.

Fargate vs. Lambda: Ease of Use

Lambda is easy to set up and operate, as there are minimal knobs to adjust compared to Fargate. More abstraction implies less operational burden, but it also implies limited flexibility. Lambda comes with a rich ecosystem that offers fully automated administration. You can use the management console or the API to call and control functions synchronously or asynchronously, including concurrency. The runtime supports a common set of functionalities and allows you to switch between different frameworks and languages. As far as operational burden goes, Lambda is easier than EC2, and Fargate stands between Lambda and EC2 in this category, leaning closer to Lambda. That said, EC2 offers the most flexibility in configuring and operating the environment, followed by Fargate and then Lambda.

Fargate vs. Lambda: Community

Both AWS Fargate and Lambda are part of the AWS serverless ecosystem, so both tools enjoy the same level of community support. Both services offer adequate support for new and advanced users, from documentation and how-to guides to tutorials and FAQs.

Fargate vs. Lambda: Cloud Agnostic

Each cloud vendor manages serverless environments differently.
For instance, C# functions written for AWS will not work on the Google Cloud. In light of the foregoing, developers need to consider cloud-agnostic issues if multi-cloud and hybrid-cloud architectures are involved. Moving between different cloud vendors involves considerable expenses and operational impacts. As such, vendor lock-in is a big challenge for serverless functions. To overcome this, we suggest using an open-source serverless framework offered by Serverless Inc. Moreover, implementing hexagonal architecture is a good idea because it allows you to move code between different serverless cloud environments. Fargate vs Lambda: Scalability In terms of Lambda vs Fargate scalability, Lambda is known as one of the best scaling technologies available in today’s market. Rapid scaling and scaling to zero are the two key strengths of Lambda. The tool instantly scales from zero to thousands and scales down from 1000 to 0, making it a good choice for low workloads, test environments, and workloads with unexpected traffic spikes. As far as Fargate is concerned, container scaling depends on resizing the underlying clusters. Furthermore, it doesn’t natively scale down to zero. Therefore, you’ll have to shut down Fargate tasks outside business hours to save on operational costs. Tasks such as configuring auto-scaling and updating base container images add up when it comes to maintenance. Fargate vs. Lambda: Security Lambda and Fargate are inherently secure as part of the AWS ecosystem. You can secure the environment using the AWS Identity and Access Management (IAM) service. Similarly, both tools abstract away the underlying infrastructure, which means the security of the infrastructure is managed by other services. The difference between the two tools lies in the IAM configuration. Lambda allows you to customize IAM roles for each function or service, while Fargate customizes each container and pod. Fargate tasks run in an isolated computing environment wherein CPU or memory is not shared with other tasks. Similarly, Lambda functions run in a dedicated execution environment. Also, Fargate offers more control over the environment and more secure touchpoints than Lambda. When to Use Fargate or Lambda? AWS Lambda Use Cases: Operating serverless websites Massively scaling operations Real-time processing of high volumes of data Predictive page rendering Scheduled events for every task and data backup Parse user input and cleanup backend data to increase a website’s rapid response time Analyzing log data on-demand Integrating with external services Converting documents into the user-requested format on-demand Real-Life Lambda Use Cases Serverless Websites: Bustle One of the best use cases for Lambda is operating serverless websites. By hosting frontend apps on S3 buckets and using CloudFront content delivery, organizations can manage static websites and take advantage of the Lambda pricing model. Bustle is a news, entertainment, and fashion website for women. The company was having difficulties scaling its application. In addition, server management, monitoring, and automation was becoming an important administrative burden. The company, therefore, decided to move to AWS Lambda with API Gateway and Amazon Kinesis to run serverless websites. Now, the company doesn’t have to worry about scaling, and its developers can deploy code at an extremely low cost. 
Event-driven Model for Workloads With Idle Times: Thomson Reuters Companies that manage workloads that are idle most of the time can benefit from the Lambda serverless feature. A notable example is Thomson Reuters, one of the world’s most trusted news organizations. The company wanted to build its own analytics engine. The small team working on this project desired a lessened administrative burden. At the same time, the tool needed to scale elastically during breaking news. Reuters chose Lambda. The tool receives data from Amazon Kinesis and automatically loads this data in a master dataset in an S3 bucket. Lambda is triggered with data integrations with Kinesis and S3. As such, Reuters enjoyed high scalability at the lowest cost possible. Highly Scalable Real-time Processing Environment: Realtor.com AWS Lambda enables organizations to scale resources while instantly cost-effectively processing tasks in real-time. Realtor.com is a leader in the real estate market. After the move to the digital world, the company started experiencing exponential traffic growth. Furthermore, the company needed a solution to update ad listings in real-time. Realtor.com chose AWS for its cloud operations. The company uses Amazon Kinesis Data Streams to collect and stream ad impressions. The internal billing system consumes this data using Amazon Kinesis Firehose, and the aggregate data is sent to the Amazon Redshift data warehouse for analysis. The application uses AWS Lambda to read Kinesis Data Streams and process each event. Realtor.com is now able to massively scale operations cost-effectively while making changes to ad listings in real-time. AWS Fargate Use Cases AWS Fargate is the best choice for managing container-based workloads with minimal idle times. Build, run, and manage APIs, microservices, and applications using containers to enjoy speed and immutability Highly scalable container-based data processing workloads Migrate legacy apps running on EC2 instances without refactoring or rearchitecting them Build and manage highly scalable AI and ML development environments Real-Life Use Cases Samsung Samsung is a leader in the electronics category. The company operates an online portal called “Samsung Developers,” which consists of SmartThings Portal for the Internet of Things (IoT), Bixby Portal for voice-based control of mobile services, and Rich Communication Services (RCS) for mobile messaging. The company was using Amazon ECS to manage the online portal. After the re: Invent 2017 event, Samsung was inspired to implement Fargate for operational efficiency. After migrating to AWS Fargate, the company no longer needed dedicated operators and administrators to manage the web services of the portal. Now, geographically distributed teams simply have to create new container images uploaded to ECR and moved to the test environment on Fargate. Developers can therefore focus more on code, and frequent deployments and administrators can focus more on performance and security. Compute costs were downsized by 44.5%. Quola Insurtech Startup Quola is a Jakarta-based insurance technology startup. The company developed software that automates claim processing using AI and ML algorithms to eliminate manual physical reviews. Quola chose AWS cloud and Fargate to run and manage container-based workloads. Amazon Simple Queue Service (SQS) is used for the message-queuing service. With Fargate, Quola is able to scale apps seamlessly. 
When a new partner joined the network, data transactions increased from 10,000 to 100,000 in a single day. Nevertheless, the app was able to scale instantly without performance being affected.

Vanguard Financial Services

Vanguard is a leading provider of financial services in the US. The company moved its on-premises operations to the AWS cloud in 2015 and now manages 1,000 apps that run on a microservices architecture. With security being a key requirement in the financial industry, Vanguard operates in the secure environment of Fargate. With Fargate, the company could offer seamless computing capacity to its containers and reduce costs by 50%.

Considerations When Moving to a Serverless Architecture

Inspired by the benefits of serverless architecture, many businesses are aggressively embracing the serverless computing model. Here are the steps to migrate monolith and legacy apps to a serverless architecture.

a) Monolith to Microservices: Most legacy apps are built using a monolith architecture. When such is the case, the first step is to break the large monolith into smaller, modular microservices, after which each microservice performs a specific task or function.
b) Implement Each Microservice as a REST API: The next step is identifying the best fit among these microservices. Implement each microservice as a REST API with API endpoints as resources. Amazon API Gateway is a fully managed service that can help you here.
c) Implement a Serverless Compute Engine: Implement a serverless compute engine such as Lambda or Fargate and move the business logic to the serverless tool so that AWS provisions resources every time a function is invoked.
d) Staggered Deployment Strategy: Migrating microservices to the serverless architecture can be done in a staggered process. Identify the right services and then build, test, and deploy them. Continue this process to smoothly and seamlessly move the entire application to the new architecture.

Considerations for Moving to AWS Lambda

Migrating legacy apps to Lambda is not a difficult job. If your application is written in any Lambda-supported language, you can simply refactor the code and migrate the app to Lambda. You only need to make some fundamental changes, such as replacing the dependency on local storage with S3 or updating authentication modules. When Fargate vs. Lambda security is considered, Lambda has fewer touchpoints to secure than Fargate. If you are using the Java runtime, keep in mind that the size of the runtime environment and its resources can result in more cold starts than with Node.js or C#. Another key point to consider is memory allocation: Lambda's maximum memory allocation is 10 GB, so if your application requires more compute and memory resources, Fargate is a better choice.

Considerations for Moving to AWS Fargate

While AWS manages resource provisioning, customers still need to handle network security tasks. For instance, when a task is created, AWS creates an Elastic Network Interface (ENI) in the VPC and automatically attaches each task ENI to its corresponding subnet. Therefore, managing the connectivity between the ENI and its touchpoints is the customer's sole responsibility. More specifically, you need to manage ENI access to AWS EC2, CloudWatch, apps running on-premises or in other regions, egress, ingress, and so on. Moreover, audit and compliance aspects must be carefully managed, which is why Fargate is not preferred for highly regulated environments.
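As a small sketch of the hexagonal-architecture idea mentioned earlier, the example below keeps the business logic free of AWS-specific code so the same function could be invoked from a Lambda handler today and from a containerized service on Fargate later. All names and fields are illustrative.

Python

import json

# Core business logic: no AWS-specific imports, so the same function can run
# inside a Lambda function today or a container on Fargate later.
def process_order(order: dict) -> dict:
    total = sum(item["price"] * item["qty"] for item in order.get("items", []))
    return {"order_id": order["id"], "total": total}

# Thin Lambda adapter around the portable core.
def lambda_handler(event, context):
    order = json.loads(event["body"])
    return {"statusCode": 200, "body": json.dumps(process_order(order))}

# A Fargate deployment would wrap process_order() behind a small HTTP server
# inside a container instead; only the adapter changes, not the core logic.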
Conclusion

The Fargate vs. Lambda battle is getting more and more interesting as the gap between container-based and serverless systems shrinks with every passing day. There is no silver bullet when deciding which service is best. With the new ability to deploy Lambda functions as Docker container images, more organizations seem to lean toward Lambda. On the other hand, organizations that need more control over the container runtime environment are sticking with Fargate.
How To Handle Secrets in Docker
By Keshav Malik
Secrets management in Docker is a critical security concern for any business. When using Docker containers, it is essential to keep sensitive data, such as passwords, API keys, and other credentials, secure. This article will discuss some best practices for managing secrets in Docker, including how to store them securely and minimize their exposure. We will explore multiple solutions: using Docker Secrets with Docker Swarm, Docker Compose, or Mozilla SOPS. Feel free to choose what's most appropriate for your use case. But most importantly, remember never to hard-code your Docker secrets in plain text in your Dockerfile! Following these guidelines ensures your organization's sensitive information remains safe even when running containerized services.

4 Ways To Store and Manage Secrets in Docker

1. Using Docker Secrets and Docker Swarm

Docker Secrets and Docker Swarm are two official and complementary tools that let you securely manage secrets when running containerized services. Docker Secrets provides a secure mechanism for storing and retrieving secrets from the system without exposing them in plain text. It enables users to keep their credentials safe by encrypting the data with a unique key before passing it to the system. Docker Swarm is a powerful tool for managing clusters of nodes for distributed applications. It provides an effective means of deploying containerized applications at scale. With this tool, you can easily manage multiple nodes within a cluster and automatically distribute workloads among them. This helps ensure your application has enough resources available at all times, even during peak usage periods or unexpected traffic spikes. Together, these two tools provide an effective way to ensure your organization's sensitive information remains safe despite ever-evolving security needs. Let's see how to create and manage an example secret.

Creating a Secret

To create a secret, we first need to initialize Docker Swarm. You can do so using the following command:

docker swarm init

Once the swarm is initialized, we can use the docker secret create command to create the secret:

ssh-keygen -t rsa -b 4096 -N "" -f mykey
docker secret create my_key mykey
rm mykey

In these commands, we first create an SSH key using the ssh-keygen command and write it to mykey. Then, we use the docker secret create command to generate the secret. Ensure you delete the mykey file to avoid any security risks. You can use the following command to confirm the secret was created successfully:

docker secret ls

We can now use this secret in our Docker containers. One way is to pass the secret with the --secret flag when creating a service:

docker service create --name redis --secret my_key redis:latest

We can also reference a secret in a docker-compose.yml file. Let's take a look at an example file:

version: '3.7'
services:
  myapp:
    image: mydummyapp:latest
    secrets:
      - my_secret
secrets:
  my_secret:
    external: true

In the example compose file, the secrets section declares a secret named my_secret as external, meaning it was created outside the file (for example, with docker secret create). The myapp service definition specifies that it requires my_secret, and Swarm mounts it as a file at /run/secrets/my_secret in the container.

2. Using Docker Compose

Docker Compose is a powerful tool for defining and running multi-container applications with Docker.
A stack is defined by a docker-compose file allowing you to define and configure the services that make up your application, including their environment variables, networks, ports, and volumes. With Docker Compose, it is easy to set up an application in a single configuration file and deploy it quickly and consistently across multiple environments. Docker Compose provides an effective solution for managing secrets for organizations handling sensitive data such as passwords or API keys. You can read your secrets from an external file (like a TXT file). But be careful not to commit this file with your code: version: '3.7' services: myapp: image: myapp:latest secrets: - my_secret secrets: my_secret: file: ./my_secret.txt 3. Using a Sidecar Container A typical strategy for maintaining and storing secrets in a Docker environment is to use sidecar containers. Secrets can be sent to the main application container via the sidecar container, which can also operate a secrets manager or another secure service. Let’s understand this using a Hashicorp Vault sidecar for a MongoDB container: First, create a Docker Compose (docker-compose.yml) file with two services: mongo and secrets. In the secrets service, use an image containing your chosen secret management tool, such as a vault. Mount a volume from the secrets container to the mongo container so the mongo container can access the secrets stored in the secrets container. In the mongo service, use environment variables to set the credentials for the MongoDB database, and reference the secrets stored in the mounted volume. Here is the example compose file: version: '3.7' services: mongo: image: mongo volumes: - secrets:/run/secrets environment: MONGO_INITDB_ROOT_USERNAME_FILE: /run/secrets/mongo-root-username MONGO_INITDB_ROOT_PASSWORD_FILE: /run/secrets/mongo-root-password secrets: image: vault volumes: - ./secrets:/secrets command: ["vault", "server", "-dev", "-dev-root-token-id=myroot"] ports: - "8200:8200" volumes: secrets: 4. Using Mozilla SOPS Mozilla SOPS (Secrets Ops) is an open-source platform that provides organizations with a secure and automated way to manage encrypted secrets in files. It offers a range of features designed to help teams share secrets in code in a safe and practical way. The following assumes you are already familiar with SOPS, if that’s not the case, start here. Here is an example of how to use SOPS with docker-compose.yml: version: '3.7' services: myapp: image: myapp:latest environment: API_KEY: ${API_KEY} secrets: - mysecrets sops: image: mozilla/sops:latest command: ["sops", "--config", "/secrets/sops.yaml", "--decrypt", "/secrets/mysecrets.enc.yaml"] volumes: - ./secrets:/secrets environment: # Optional: specify the path to your PGP private key if you encrypted the file with PGP SOPS_PGP_PRIVATE_KEY: /secrets/myprivatekey.asc secrets: mysecrets: external: true In the above, the myapp service requires a secret called API_KEY. The secrets section uses a secret called mysecrets, which is expected to be stored in an external key/value store, such as Docker Swarm secrets or HashiCorp Vault. The sops service uses the official SOPS Docker image to decrypt the mysecrets.enc.yaml file, which is stored in the local ./secrets directory. The decrypted secrets are mounted to the myapp service as environment variables. Note: Make sure to create the secrets directory and add the encrypted mysecrets.enc.yaml file and the sops.yaml configuration file (with SOPS configuration) in that directory. 
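Whichever of the approaches above you pick, your application code ultimately reads the secret at runtime. The sketch below assumes the file-based layout used by Docker Secrets and Compose (/run/secrets/<name>) and, as an optional alternative, the Vault sidecar from section 3 via the third-party hvac client; the secret names, Vault address, and token handling are placeholders, not part of the original examples.

Python

import os
import hvac  # third-party Vault client, shown only as an illustration

def read_mounted_secret(name: str) -> str:
    # Docker Secrets and the Compose "secrets:" section mount each secret
    # as a read-only file under /run/secrets/<name>.
    with open(f"/run/secrets/{name}") as f:
        return f.read().strip()

def read_vault_secret(path: str) -> dict:
    # Talks to the Vault sidecar from section 3; the address matches the
    # "secrets" service name and the token comes from the environment.
    client = hvac.Client(url="http://secrets:8200",
                         token=os.environ["VAULT_TOKEN"])
    return client.secrets.kv.v2.read_secret_version(path=path)["data"]["data"]

db_password = read_mounted_secret("my_secret")
# api_keys = read_vault_secret("myapp/api-keys")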
Scan for Secrets in Your Docker Images

Hard-coding secrets in Docker is a significant security risk, making them vulnerable to attackers. We have seen different best practices to avoid hard-coding secrets in plain text in your Docker images, but security doesn't stop there.

You Should Also Scan Your Images for Secrets

All Dockerfiles start with a FROM directive that defines the base image. It's important to understand that when you use a base image, especially from a public registry like Docker Hub, you are pulling external code that may contain hardcoded secrets. More information is exposed than is visible in your single Dockerfile. Indeed, it's possible to retrieve a plain-text secret hard-coded in a previous layer of your image. In fact, many public Docker images are affected: in 2021, we estimated that 7% of Docker Hub images contained at least one secret. Fortunately, you can easily detect them with ggshield (the GitGuardian CLI). For example:

ggshield secret scan docker ubuntu:22.04

Conclusion

Managing secrets in Docker is a crucial part of preserving the security of your containerized apps. Docker includes several built-in tools for maintaining secrets, such as Docker Secrets and Docker Compose files. Additionally, organizations can use third-party solutions, like HashiCorp Vault and Mozilla SOPS, to manage secrets in Docker. These technologies offer extra capabilities, like access control, encryption, and audit logging, to strengthen the security of your secret management. Finally, finding and limiting accidental or unintended exposure of sensitive information is crucial to handling secrets in Docker. Companies are invited to use secret scanning tools, such as GitGuardian, to scan the Docker images built in their CI/CD pipelines as a mitigation against supply-chain attacks. If you want to know more about Docker security, we have also summarized some of the best practices in a cheat sheet.
The Power of Docker Images: A Comprehensive Guide to Building From Scratch
By Ruchita Varma
What Is Advertised Kafka Address?
By Christina Lin
Deploying Prometheus and Grafana as Applications using ArgoCD — Including Dashboards
By Lidor Ettinger
Building a REST API With AWS Gateway and Python
AWS Gateway is a powerful tool for building APIs that scale to meet the demands of modern web and mobile applications. With AWS Gateway, you can create RESTful APIs that expose your data and business logic to developers who can then build rich, interactive applications that consume your API. REST API is an industry standard for building scalable, distributed web applications. With AWS Gateway, you can easily build a REST API that supports both GET and POST methods, as well as complex query parameters. You can also add support for other HTTP methods, such as PUT, DELETE, and HEAD. Using AWS Gateway, you can quickly create APIs that are secure and robust. You can also use it to deploy your code to a production environment with minimal effort. Additionally, AWS Gateway allows for seamless integration with other AWS services, such as S3 and DynamoDB, enabling you to easily add complex functionality to your APIs. Pre-requisites Before building a RESTful API with AWS Gateway, you should have the following in place: Create an AWS account if you don’t have one already. Log in to the AWS Management Console and navigate to the Amazon API Gateway service. Click on "Create API" and select "REST API". Click on "Actions" to define the resource and click "Create Method". Choose the HTTP verb (e.g. GET, POST, PUT, etc.) and click on the checkmark to create the method. In the "Integration type" section, select "Lambda Function" and enter the name of the Lambda function you want to use to handle the API requests. Click on "Save" to create the API. Select Node from the Runtime Dropdown. Code Example Python import json # Example data data = { "items": [ {"id": 1, "name": "Item 1", "price": 10.99}, {"id": 2, "name": "Item 2", "price": 15.99}, {"id": 3, "name": "Item 3", "price": 20.99}, ] } def lambda_handler(event, context): # Determine the HTTP method of the request http_method = event["httpMethod"] # Handle GET request if http_method == "GET": # Return the data in the response response = { "statusCode": 200, "body": json.dumps(data) } return response # Handle POST request elif http_method == "POST": # Retrieve the request's body and parse it as JSON body = json.loads(event["body"]) # Add the received data to the example data data["items"].append(body) # Return the updated data in the response response = { "statusCode": 200, "body": json.dumps(data) } return response # Handle PUT request elif http_method == "PUT": # Retrieve the request's body and parse it as JSON body = json.loads(event["body"]) # Update the example data with the received data for item in data["items"]: if item["id"] == body["id"]: item.update(body) break # Return the updated data in the response response = { "statusCode": 200, "body": json.dumps(data) } return response # Handle DELETE request elif http_method == "DELETE": # Retrieve the request's body and parse it as JSON body = json.loads(event["body"]) # Find the item with the specified id in the example data for i, item in enumerate(data["items"]): if item["id"] == body["id"]: # Remove the item from the example data del data["items"][i] break # Return the updated data in the response response = { "statusCode": 200, "body": json.dumps(data) } return response else: # Return an error message for unsupported methods response = { "statusCode": 405, "body": json.dumps({"error": "Method not allowed"}) } return response This code defines a Lambda function, lambda_handler, that handles different types of HTTP requests (GET, POST, PUT, DELETE) on some data. 
The data is an object containing an array of items, and each item has an id, name, and price. When the function is called, it first determines the HTTP method of the request from the event object and then handles the request accordingly:

GET: returns the data in the response with a status code of 200.
POST: retrieves the request's body, parses it as JSON, adds the received data to the example data, and returns the updated data in the response with a status code of 200.
PUT: retrieves the request's body, parses it as JSON, updates the example data with the received data, and returns the updated data in the response with a status code of 200.
DELETE: retrieves the request's body, parses it as JSON, finds the item with the specified id in the example data and removes it, and returns the updated data in the response with a status code of 200.

If the method is not supported, the function returns an error message with a status code of 405.

Deploy the API by clicking on "Actions" and selecting "Deploy API". Select a deployment stage (e.g., "prod" or "test") and click on "Deploy". Use the generated API endpoint to make requests to your API.

Running and Testing the Code in Postman

Now our API is up and running. You can send a test HTTP request through Postman. By sending a request to your invoke URL, you should see a 200 OK status code. For this test, no request body is needed for the incoming request.
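If you prefer to script the same checks instead of using Postman, here is a small sketch using the Python requests library. The invoke URL and the resource path are placeholders; substitute the endpoint generated when you deployed the API.

Python

import requests

# Placeholder invoke URL and resource path; use the endpoint shown in the
# API Gateway console after deployment.
base_url = "https://abc123.execute-api.us-east-1.amazonaws.com/prod/items"

# GET: fetch the example data.
print(requests.get(base_url).json())

# POST: add a new item.
print(requests.post(base_url, json={"id": 4, "name": "Item 4", "price": 5.99}).json())

# PUT: update the item we just added.
print(requests.put(base_url, json={"id": 4, "price": 6.49}).json())

# DELETE: remove it again.
print(requests.delete(base_url, json={"id": 4}).json())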

By Derric Gilling
Data Stream Using Apache Kafka and Camel Application
Apache Kafka is an event streaming platform that was developed by LinkedIn and later made open-source under the Apache Software Foundation. Its primary function is to handle high-volume real-time data streams and provide a scalable and fault-tolerant architecture for creating data pipelines, streaming applications, and microservices. Kafka employs a publish-subscribe messaging model, in which data is sorted into topics, and publishers send messages to those topics. Subscribers can then receive those messages in real time. The platform offers a scalable and fault-tolerant architecture by spreading data across multiple nodes and replicating data across multiple brokers. This guarantees that data is consistently available, even if a node fails. Kafka's architecture is based on several essential components, including brokers, producers, consumers, and topics. Brokers manage the message queues and handle message persistence, while producers and consumers are responsible for publishing and subscribing to Kafka topics, respectively. Topics function as the communication channels through which messages are sent and received. Kafka also provides an extensive range of APIs and tools to manage data streams and build real-time applications. Kafka Connect, one of its most popular tools and APIs, enables the creation of data pipelines that integrate with other systems. Kafka Streams, on the other hand, allows developers to build streaming applications using a high-level API. In summary, Kafka is a robust and adaptable platform that can be used to construct real-time data pipelines and streaming applications. It has been widely adopted in various sectors, including finance, healthcare, e-commerce, and more. To create a Kafka data stream using Camel, you can use the Camel-Kafka component, which is already included in Apache Camel. Below are the steps to follow for creating a Kafka data stream using Camel: Prepare a Kafka broker and create a topic for the data stream. Set up a new Camel project on your IDE and include the required Camel dependencies, including the Camel-Kafka component. Create a new Camel route within your project that defines the data stream. The route should use the Kafka component and specify the topic to which the data should be sent or received. Select the appropriate data format for the data stream. For instance, if you want to send JSON data, use the Jackson data format to serialize and deserialize the data. Launch the Camel context and the Kafka producer or consumer to start sending or receiving data. Overall, using the Camel-Kafka component with Apache Camel is a simple way to create data streams between applications and a Kafka cluster. 
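As an aside, before looking at the Camel routes below, the publish-subscribe model described above can be illustrated with a few lines of plain Python using the kafka-python client. This is only a sketch to show producers, consumers, and topics; the broker address and topic name are placeholders, and it is not part of the Camel example.

Python

from kafka import KafkaProducer, KafkaConsumer

# Publish a message to a topic (broker address and topic name are placeholders).
producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("my-topic", key=b"42", value=b'{"event": "example"}')
producer.flush()

# Subscribe to the same topic and read messages as they arrive.
consumer = KafkaConsumer(
    "my-topic",
    bootstrap_servers="localhost:9092",
    group_id="example-group",
    auto_offset_reset="earliest",
)
for record in consumer:
    print(record.key, record.value)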
Here is the code for reading Table form DB and writing to Kafka cluster: Apache Camel Producer Application: Java import org.apache.camel.builder.RouteBuilder; import org.apache.camel.component.kafka.KafkaConstants; import org.springframework.stereotype.Component; @Component public class OracleDBToKafkaRouteBuilder extends RouteBuilder { @Override public void configure() throws Exception { // Configure Oracle DB endpoint String oracleDBEndpoint = "jdbc:oracle:thin:@localhost:1521:orcl"; String oracleDBUser = "username"; String oracleDBPassword = "password"; String oracleDBTable = "mytable"; String selectQuery = "SELECT * FROM " + oracleDBTable; // Configure Kafka endpoint String kafkaEndpoint = "kafka:my-topic?brokers=localhost:9092"; String kafkaSerializer = "org.apache.kafka.common.serialization.StringSerializer"; from("timer:oracleDBPoller?period=5000") // Read from Oracle DB .to("jdbc:" + oracleDBEndpoint + "?user=" + oracleDBUser + "&password=" + oracleDBPassword) .setBody(simple(selectQuery)) .split(body()) // Serialize to Kafka .setHeader(KafkaConstants.KEY, simple("${body.id}")) .marshal().string(kafkaSerializer) .to(kafkaEndpoint); } } Here is the code for reading Kafka Topic and writing the Oracle DB table: Apache Camel Camel Application; Java import org.apache.camel.builder.RouteBuilder; import org.apache.camel.component.kafka.KafkaConstants; import org.springframework.stereotype.Component; @Component public class KafkaToOracleDBRouteBuilder extends RouteBuilder { @Override public void configure() throws Exception { // Configure Kafka endpoint String kafkaEndpoint = "kafka:my-topic?brokers=localhost:9092"; String kafkaDeserializer = "org.apache.kafka.common.serialization.StringDeserializer"; // Configure Oracle DB endpoint String oracleDBEndpoint = "jdbc:oracle:thin:@localhost:1521:orcl"; String oracleDBUser = "username"; String oracleDBPassword = "password"; String oracleDBTable = "mytable"; from(kafkaEndpoint) // Deserialize from Kafka .unmarshal().string(kafkaDeserializer) .split(body().tokenize("\n")) // Write to Oracle DB .to("jdbc:" + oracleDBEndpoint + "?user=" + oracleDBUser + "&password=" + oracleDBPassword) .setBody(simple("INSERT INTO " + oracleDBTable + " VALUES(${body})")) .to("jdbc:" + oracleDBEndpoint + "?user=" + oracleDBUser + "&password=" + oracleDBPassword); } }

By Kiran Peddireddy
What Is Docker Swarm?
Docker Swarm: Simplifying Container Orchestration

In recent years, containers have become an increasingly popular way to package, distribute, and deploy software applications. They offer several advantages over traditional virtual machines, including faster start-up times, improved resource utilization, and greater flexibility. However, managing containers at scale can be challenging, especially when running large, distributed applications. This is where container orchestration tools come into play, and Docker Swarm is one of the most popular options available.

What Is Docker Swarm?

Docker Swarm is a container orchestration tool that allows you to deploy and manage a cluster of Docker nodes. Each node is a machine that hosts one or more Docker containers, and together, they form a swarm. Docker Swarm provides a simple and intuitive interface for managing and monitoring your containers, making it an ideal tool for large-scale container deployments. Docker Swarm makes it easy to deploy and manage containerized applications across multiple hosts. It provides features such as load balancing, automatic service discovery, and fault tolerance. With Docker Swarm, you can easily scale your applications up or down by adding or removing Docker nodes from the cluster, making it easy to handle changes in traffic or resource usage.

How Does Docker Swarm Work?

Docker Swarm is built on top of the Docker Engine, the core component of the Docker platform. The Docker Engine runs on each node in the swarm and manages the lifecycle of the containers running on that node. The nodes work together to form the swarm, and Docker Swarm uses a leader-follower model to manage them: the leader node is responsible for managing the overall state of the swarm and coordinating the activities of the follower nodes, while the follower nodes are responsible for running the containers and executing the tasks assigned to them by the leader node.

When you deploy an application to Docker Swarm, you define a set of services that make up the application. Each service consists of one or more containers that perform a specific function. For example, you might have a service that runs a web server and another service that runs a database. Docker Swarm automatically distributes the containers across the nodes in the swarm, ensuring that each service is running on the appropriate nodes. It also provides load balancing and service discovery, making it easy to access your applications from outside the swarm. A short programmatic sketch of this service model appears at the end of this article.

Docker Swarm provides several features that make it easy to manage containers at scale, including:

Load Balancing

Docker Swarm automatically distributes incoming traffic across the nodes running the containers in the swarm, ensuring that each container receives a fair share of the traffic.
Docker Swarm provides built-in load balancing to distribute traffic evenly across containers in a cluster. This helps to ensure that each container receives an equal share of the workload and prevents any single container from becoming overloaded. Automatic Service Discovery Docker Swarm automatically updates a DNS server with the IP addresses of containers running in the swarm. This makes it easy to access your containers using a simple domain name, even as the containers move around the swarm. Docker Swarm automatically assigns unique DNS names to containers, making it easy to discover and connect to services running within the swarm. This feature simplifies the management of large, complex, containerized applications. Fault Tolerance Docker Swarm automatically detects when a container fails and automatically restarts it on another node in the swarm. This ensures that your applications remain available even if individual containers or nodes fail. Scaling Docker Swarm makes it easy to scale your applications up or down by adding or removing nodes from the swarm. This makes it easy to handle changes in traffic or resource usage. Docker Swarm enables easy scaling of containerized applications. As your application traffic grows, you can add more nodes to the cluster, and Docker Swarm automatically distributes the containers across the new nodes. Rolling Updates Docker Swarm allows for rolling updates, where you can update containers without disrupting the application’s availability. This is achieved by updating containers one at a time while other containers continue to handle the traffic. Security Docker Swarm provides built-in security features to help protect your containerized applications. For example, it supports mutual TLS encryption for securing communication between nodes in the cluster. Ease of Use Docker Swarm is designed to be easy to use, with a simple API and command-line interface that makes it easy to deploy and manage containerized applications. High Availability Docker Swarm is designed to provide high availability for containerized applications. It automatically distributes containers across multiple nodes in a cluster and provides fault tolerance so that even if a node or container fails, the application remains available. Overall, Docker Swarm provides a range of powerful features that make it an ideal choice for managing containers at scale. With its support for high availability, scalability, load balancing, service discovery, rolling updates, security, and ease of use, Docker Swarm simplifies the management of containerized applications, allowing you to focus on delivering value to your customers. Benefits of Docker Swarm Docker Swarm offers several benefits for organizations that are deploying containerized applications at scale. These include: Simplified Management Docker Swarm provides a simple and intuitive interface for managing containers at scale. This makes it easy to deploy, monitor, and scale your applications. High Availability Docker Swarm provides built-in fault tolerance, ensuring that your applications remain available even if individual containers or nodes fail. Scalability Docker Swarm makes it easy to scale your applications up or down by adding or removing nodes from the swarm. This makes it easy to handle changes in traffic or resource usage. Compatibility Docker Swarm is fully compatible with the Docker platform, making it easy to use alongside other Docker tools and services. 
Portability Docker Swarm allows you to easily deploy and manage containerized applications across different environments, including on-premises and in the cloud. This helps to ensure that your applications can be easily moved and scaled as needed, providing flexibility and agility for your business. Conclusion Docker Swarm is a powerful tool for managing containers at scale. It provides a simple and intuitive interface for deploying and managing containerized applications across multiple hosts while also providing features such as load balancing, automatic service discovery, and fault tolerance. Docker Swarm is a very powerful tool for anyone looking to deploy and manage containerized applications at scale. It provides a simple and intuitive interface for managing a cluster of Docker nodes, allowing you to easily deploy and manage services across multiple hosts. With features such as load balancing, service discovery, and fault tolerance, Docker Swarm makes it easy to run containerized applications in production environments. If you’re using Docker for containerization, Docker Swarm is definitely worth checking out.
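As a concrete sketch of the service model described in this article, the example below uses the Docker SDK for Python to initialize a swarm and create a replicated, load-balanced service; the equivalent CLI commands are docker swarm init and docker service create. The advertise address, image, service name, and port mapping are placeholders.

Python

import docker
from docker.types import EndpointSpec, ServiceMode

client = docker.from_env()

# Turn the local engine into a swarm manager (CLI: docker swarm init).
client.swarm.init(advertise_addr="192.168.1.10")  # address is a placeholder

# Create a replicated service; Swarm spreads the replicas across nodes and
# load-balances traffic arriving on the published port.
service = client.services.create(
    image="nginx:latest",                       # placeholder image
    name="web",
    mode=ServiceMode("replicated", replicas=3),
    endpoint_spec=EndpointSpec(ports={8080: 80}),
)

# Scale out later by updating the service (CLI: docker service scale web=5).
service.scale(5)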

By Aditya Bhuyan
Host Hack Attempt Detection Using ELK
What Is SIEM? SIEM stands for Security Information and Event Management. It is a software solution that provides real-time analysis of security alerts generated by network hardware and applications. SIEM collects log data from multiple sources such as network devices, servers, and applications, then correlates and analyzes this data to identify security threats. SIEM can help organizations improve their security posture by providing a centralized view of security events across the entire IT infrastructure. It allows security analysts to quickly identify and respond to security incidents and provides detailed reports for compliance purposes. Some of the key features of SIEM solutions include: Log collection and analysis Real-time event correlation and alerting User and entity behavior analytics Threat intelligence integration Compliance reporting SIEM is often used in conjunction with other security solutions, such as firewalls, intrusion detection systems, and antivirus software, to provide comprehensive security monitoring and incident response capabilities. What Is ELK? ELK is an acronym for a set of open-source software tools used for log management and analysis: Elasticsearch, Logstash, and Kibana. Elasticsearch is a distributed search and analytics engine that provides fast search and efficient storage of large volumes of data. It is designed to be scalable and can handle a large number of queries and indexing operations in real-time. Logstash is a data collection and processing tool that allows you to collect logs and other data from multiple sources, such as log files, syslog, and other data sources, and transform and enrich the data before sending it to Elasticsearch. Kibana is a web-based user interface that allows you to visualize and analyze data stored in Elasticsearch. It provides a range of interactive visualizations, such as line graphs, bar charts, and heatmaps, as well as features such as dashboards and alerts. Together, these three tools form a powerful platform for managing and analyzing logs and other types of data, commonly referred to as the ELK stack or Elastic stack. The ELK stack is widely used in IT operations, security monitoring, and business analytics to gain insights from large amounts of data. Ingesting SIEM Data to ELK Ingesting SIEM data into the ELK stack can be useful for organizations that want to combine the security event management capabilities of SIEM with the log management and analysis features of ELK. Here are the high-level steps to ingest SIEM data into ELK: Configure the SIEM to send log data to Logstash, which is part of the ELK stack. Create a Logstash configuration file that defines the input, filters, and output for the SIEM data. Start Logstash and verify that it is receiving and processing SIEM data correctly. Configure Elasticsearch to receive and store the SIEM data. Create Kibana visualizations and dashboards to display the SIEM data. 
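To verify that the pipeline is receiving events before wiring up a real SIEM, you can emit a test syslog message from a short Python script such as the sketch below. The host and port are placeholders and must match the syslog input of your Logstash configuration (port 5514 in the example that follows).

Python

import socket
from datetime import datetime

# Host and port are placeholders; they must match the syslog input defined in
# your Logstash configuration (port 5514 in the example that follows).
LOGSTASH_HOST, LOGSTASH_PORT = "localhost", 5514

timestamp = datetime.now().strftime("%b %d %H:%M:%S")
message = (f"<34>{timestamp} web-01 sshd[1042]: "
           "Failed password for invalid user admin from 203.0.113.7 port 52144 ssh2")

# The Logstash syslog input listens on both TCP and UDP; UDP is used here.
with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
    sock.sendto(message.encode(), (LOGSTASH_HOST, LOGSTASH_PORT))
print("Test event sent")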
Here is an example of a Logstash configuration file that receives Syslog messages from a SIEM and sends them to Elasticsearch: Logstash input { syslog { type => "syslog" port => 5514 } } filter { if [type] == "syslog" { grok { match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" } add_field => [ "received_at", "%{@timestamp}" ] add_field => [ "received_from", "%{host}" ] } } } output { elasticsearch { hosts => ["localhost:9200"] index => "siem" } } Once Logstash is configured and running, SIEM data will be ingested into Elasticsearch and can be visualized and analyzed in Kibana. It's important to ensure that the appropriate security measures are in place to protect the SIEM and ELK environments and to monitor and alert on any security events. Detecting Host Hack Attempts Detecting host hack attempts using SIEM in ELK involves monitoring and analyzing system logs and network traffic to identify suspicious activity that may indicate a hack attempt. Here are the high-level steps to set up host hack attempt detection using SIEM in ELK: Configure the hosts to send system logs and network traffic to a centralized log collection system. Set up Logstash to receive and parse the logs and network traffic data from the hosts. Configure Elasticsearch to store the parsed log data. Use Kibana to analyze the log data and create dashboards and alerts to identify potential hack attempts. Here are some specific techniques that can be used to detect host hack attempts: Monitor for failed login attempts: Look for repeated failed login attempts from a single IP address, which may indicate a brute-force attack. Use Logstash to parse the system logs for failed login events and create a Kibana dashboard or alert to monitor for excessive failed login attempts. Monitor for suspicious network traffic: Look for network traffic to or from known malicious IP addresses or domains. Use Logstash to parse network traffic data and create a Kibana dashboard or alert to monitor for suspicious traffic patterns. Monitor for file system changes: Look for unauthorized changes to system files or settings. Use Logstash to parse file system change events and create a Kibana dashboard or alert to monitor for unauthorized changes. Monitor for suspicious process activity: Look for processes that are running with elevated privileges or that are performing unusual actions. Use Logstash to parse process events and create a Kibana dashboard or alert to monitor for suspicious process activity. By implementing these techniques and regularly monitoring the logs and network traffic, organizations can improve their ability to detect and respond to host hack attempts using SIEM in ELK. Configure Alert in ELK to Detect Host Hack Attempt To configure an alert in ELK to detect a host hack attempt, you can follow these general steps: Create a search query in Kibana that filters logs for Host Hack Attempt events. For example, the following Python snippet runs an equivalent query against Elasticsearch to find failed login attempts: Python from elasticsearch import Elasticsearch es = Elasticsearch() search_query = { "query": { "bool": { "must": [ { "match": { "event.dataset": "auth" } }, { "match": { "event.action": "failed_login" } } ] } } } res = es.search(index="siem", body=search_query) for hit in res['hits']['hits']: print(hit['_source']) Once you have created your search query, save it as a Kibana saved search.
Go to the Kibana Alerts and Actions interface and create a new alert. Choose the saved search you created in step 2 as the basis for the alert. Configure the alert to trigger when a certain threshold is met. For example, you can configure the alert to trigger when there are more than 5 failed login attempts within a 5-minute window. Configure the alert to send a notification, such as an email or Slack message, when it triggers. Test the alert to ensure that it is working as expected. Once the alert is configured, it will automatically trigger when it detects a Host Hack Attempt event, such as a failed login attempt. This can help organizations detect and respond to security threats efficiently and effectively. It is important to regularly review and update your alerts to ensure they are detecting the most relevant and important security events. Conclusion Using ELK to detect host hack attempts is an effective approach to enhance the security posture of an organization. ELK provides a powerful combination of log collection, parsing, storage, analysis, and alerting capabilities, which enable organizations to detect and respond to host hack attempts in real-time. By monitoring system logs and network traffic, and using advanced search queries and alerting mechanisms, ELK can help organizations detect a wide range of host hack attempts, including failed login attempts, suspicious network traffic, file system changes, and suspicious process activity. Implementing a robust host hack attempt detection strategy using ELK requires careful planning, configuration, and testing. However, with the right expertise and tools, organizations can create a comprehensive security monitoring system that provides real-time visibility into their network, improves incident response times, and helps prevent security breaches before they occur.

By Rama Krishna Panguluri
Apache Druid, TiDB, ClickHouse, or Apache Doris? Comparing the OLAP Tools We Have Used

To brief you about me, I lead the Big Data team at NIO, an electric vehicle manufacturer. I have tried a fair share of the OLAP tools available on the market, and here is what I think you need to know. Apache Druid Back in 2017, looking for an OLAP tool on the market was like seeking a tree on an African prairie—there were only a few of them. As we looked up and scanned the horizon, our eyes lingered on Apache Druid and Apache Kylin. We landed on Druid because we were already familiar with it, while Kylin, despite its impressively high query efficiency from pre-computation, had a few shortcomings: The best storage engine for Kylin would be HBase, but introducing HBase would bring in a whole new bunch of operation and maintenance burdens. Kylin pre-computes the dimensions and metrics, but the dimensional explosion that comes with it puts great pressure on storage. As for Druid, it used columnar storage, supported real-time and offline data ingestion, and delivered fast queries. On the flip side, it: Used no standard protocols, such as JDBC, and was thus beginner-unfriendly. Had weak support for Join. Could be slow at exact deduplication, which hurt performance. Required huge maintenance effort due to all the components with various installation methods and dependencies. Required changes to the Hadoop integration and JAR package dependencies when it came to data ingestion. TiDB We tried TiDB in 2019. Long story short, here are its pros and cons: Pros It was an OLTP + OLAP database that supported easy updates. It had the features we needed, including aggregate and breakdown queries, metric computation, and dashboarding. It supported standard SQL, so it was easy to grasp. It didn't require too much maintenance. Cons The fact that TiFlash relied on OLTP could put more pressure on storage. As a non-independent OLAP engine, its analytical processing capability was less than ideal. Its performance varied among scenarios. ClickHouse vs. Apache Doris We did our research into ClickHouse and Apache Doris. We were impressed by ClickHouse's awesome standalone performance, but stopped looking further into it when we found that: It did not give us what we wanted when it came to multi-table Join, which was an important use case for us. It had relatively low concurrency. It could bring high operation and maintenance costs. Apache Doris, on the other hand, ticked a lot of the boxes on our requirement list: It supported high-concurrency queries, which was our biggest concern. It was capable of real-time and offline data processing. It supported aggregate and breakdown queries. Its Unique model (a type of data model in Doris that ensures unique keys) supported updates. It could largely speed up queries via materialized views. It was compatible with the MySQL protocol, so there was little trouble in development and adoption. Its query performance fit the bill. It required only simple operation and maintenance (O&M). To sum up, Apache Doris appeared to be an ideal substitute for Apache Druid + TiDB. Our Hands-On OLAP Experience Here is a diagram to show you how data flows through our OLAP system: Data Sources We pool data from our business system, event tracking, devices, and vehicles into our big data platform. Data Import We enable CDC for our business data. Any changes in such data will be converted into a data stream and stored in Kafka, ready for stream computing. As for data that can only be imported in batches, it will go directly into our distributed storage.
Data Processing Instead of integrating streaming and batch processing, we adopted Lambda architecture. Our business status quo determines that our real-time and offline data come from different links. In particular: Some data comes in the form of streams. Some data can be stored in streams, while some historical data will not be stored in Kafka. Some scenarios require high data precision. To realize that, we have an offline pipeline that re-computes and refreshes all relevant data. Data Warehouse Instead of using the Flink/Spark-Doris Connector, we use the routine load method to transfer data from Flink to Doris, and broker load from Spark to Doris. Data produced in batches by Flink and Spark will be backed up to Hive for usage in other scenarios. This is our way to increase data efficiency. Data Services In terms of data services, we enable auto-generation of APIs through data source registration and flexible configuration so we can manage traffic and authority via APIs. In combination with the K8s serverless solution, the whole thing works great. Data Application In the data application layer, we have two types of scenarios: User-facing scenarios such as dashboards and metrics. Vehicle-oriented scenarios, where vehicle data is collected into Apache Doris for further processing. Even after aggregation, we still have a data size measured in billion but the overall computing performance is up to scratch. Our CDP Practice Like most companies, we build our own Customer Data Platform (CDP): Usually, a CDP is made up of a few modules: Tags: the building block, obviously. We have basic tags and customer behavior tags. We can also define other tags as we want. Groups: divide customers into groups based on the tags. Insights: characteristics of each customer group. Reach: ways to reach customers, including text messages, phone calls, APP notifications, and IM. Effect analysis: feedback about how the CDP runs. We wanted to achieve real-time + offline integration, fast grouping, quick aggregation, multi-table Join, and federated queries in our CDP. Here is how it is done: Real-Time + Offline We have real-time tags and offline tags and need them to be placed together. Plus, columns on the same data might be updated at different frequencies. Some basic tags (regarding the identity of customers) should be updated in real time, while other tags (age, gender) can be updated daily. We want to put all the atomic tags of customers in one table because that brings the least maintenance costs and can largely reduce the number of required tables when we add self-defined tags. So how do we achieve this? We use the routineload method of Apache Doris to update real-time data, and the broker load method to batch import offline data. We also use these two methods to update different columns in the same table, respectively. Fast Grouping Basically, grouping is to combine a certain group of tags and find the overlapping data. This can be complicated. Doris helped speed up this process by SIMD optimization. Quick Aggregation We need to update all the tags, re-compute the distribution of customer groups, and analyze effects on a daily basis. Such processing needs to be quick and neat. So we divide data into tablets based on time so there will be less data transfer and faster computation. When calculating the distribution of customer groups, we pre-aggregate data at each node and collect them for further aggregation. In addition, the vectorized execution engine of Doris is a real performance accelerator. 
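As a concrete illustration of the Routine Load approach mentioned above, here is a hedged sketch of a Doris routine load job that continuously consumes tag updates from Kafka; the database, table, column, and topic names are assumptions for illustration, and exact properties may differ between Doris versions. Because Doris is MySQL-protocol compatible, the statement can be submitted through any MySQL client:

SQL
-- Continuously ingest real-time tag updates from a Kafka topic into a Doris table
CREATE ROUTINE LOAD cdp_db.customer_tags_load ON customer_tags
COLUMNS TERMINATED BY ",",
COLUMNS (customer_id, tag_name, tag_value, updated_at)
PROPERTIES (
    "desired_concurrent_number" = "1",
    "max_error_number" = "1000"
)
FROM KAFKA (
    "kafka_broker_list" = "broker1:9092,broker2:9092",
    "kafka_topic" = "customer-tag-updates"
);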
Multi-Table Join Since our basic data is stored in multiple data tables, when CDP users customize the tags they need, they need to conduct multi-table Join. An important factor that attracted us to Apache Doris was its promising multi-table Join capability. Federated Queries Currently, we use Apache Doris in combination with TiDB. Records about customer reach will be put in TiDB, and data regarding credit points and vouchers will be processed in TiDB, too, since it is a better OLTP tool. As for more complicated analysis, such as monitoring the effectiveness of customer operation, we need to integrate information about task execution and target groups. This is when we conduct federated queries across Doris and TiDB. Conclusion This is our journey from Apache Druid, TiDB, and Apache Doris (and a short peek into ClickHouse in the middle). We looked into the performance, SQL semantics, system compatibility, and maintenance costs of each of them and ended up with the OLAP architecture we have now. If you have the same aspects of concern as us, this might be a reference for you.

By Huaidong Tang
Integrating AWS Secrets Manager With Spring Boot

In a microservices architecture, it’s common to have multiple services that need access to sensitive information, such as API keys, passwords, or certificates. Storing this sensitive information in code or configuration files is not secure because it’s easy for attackers to gain access to this information if they can access your source code or configuration files. To protect sensitive information, microservices often use a secrets management system, such as Amazon Secrets Manager, to securely store and manage this information. Secrets management systems provide a secure and centralized way to store and manage secrets, and they typically provide features such as encryption, access control, and auditing. Amazon Secrets Manager is a fully managed service that makes it easy to store and retrieve secrets, such as database credentials, API keys, and other sensitive information. It provides a secure and scalable way to store secrets, and integrates with other AWS services to enable secure access to these secrets from your applications and services. Some benefits of using Amazon Secrets Manager in your microservices include: Centralized management: You can store all your secrets in a central location, which makes it easier to manage and rotate them. Fine-grained access control: You can control who has access to your secrets, and use AWS Identity and Access Management (IAM) policies to grant or revoke access as needed. Automatic rotation: You can configure Amazon Secrets Manager to automatically rotate your secrets on a schedule, which reduces the risk of compromised secrets. Integration with other AWS services: You can use Amazon Secrets Manager to securely access secrets from other AWS services, such as Amazon RDS or AWS Lambda. Overall, using a secrets management system, like Amazon Secrets Manager, can help improve the security of your microservices by reducing the risk of sensitive information being exposed or compromised. In this article, we will discuss how you can define a secret in Amazon Secrets Manager and later pull it using the Spring Boot microservice. Creating the Secret To create a new secret in Amazon Secrets Manager, you can follow these steps: Open the Amazon Secrets Manager console by navigating to the “AWS Management Console,” selecting “Secrets Manager” from the list of services, and then clicking “Create secret” on the main page. Choose the type of secret you want to create: You can choose between “Credentials for RDS database” or “Other type of secrets.” If you select “Other type of secrets,” you will need to enter a custom name for your secret. Enter the secret details: The information you need to enter will depend on the type of secret you are creating. For example, if you are creating a database credential, you will need to enter the username and password for the database. Configure the encryption settings: By default, Amazon Secrets Manager uses AWS KMS to encrypt your secrets. You can choose to use the default KMS key or select a custom key. Define the secret permissions: You can define who can access the secret by adding one or more AWS Identity and Access Management (IAM) policies. Review and create the secret: Once you have entered all the required information, review your settings and click “Create secret” to create the secret. Alternatively, you can also create secrets programmatically using AWS SDK or CLI. 
Here’s an example of how you can create a new secret using the AWS CLI: Shell aws secretsmanager create-secret --name my-secret --secret-string '{"username": "myuser", "password": "mypassword"}' This command creates a new secret called “my-secret” with a JSON-formatted secret string containing a username and password. You can replace the secret string with any other JSON-formatted data you want to store as a secret. You can also create these secrets from your microservice as well: Add the AWS SDK for Java dependency to your project: You can do this by adding the following dependency to your pom.xml file: XML <dependency> <groupId>com.amazonaws</groupId> <artifactId>aws-java-sdk-secretsmanager</artifactId> <version>1.12.83</version> </dependency> Initialize the AWS Secrets Manager client: You can do this by adding the following code to your Spring Boot application’s configuration class: Java @Configuration public class AwsConfig { @Value("${aws.region}") private String awsRegion; @Bean public AWSSecretsManager awsSecretsManager() { return AWSSecretsManagerClientBuilder.standard() .withRegion(awsRegion) .build(); } } This code creates a new bean for the AWS Secrets Manager client and injects the AWS region from the application.properties file. Create a new secret: You can do this by adding the following code to your Spring Boot service class: Java @Autowired private AWSSecretsManager awsSecretsManager; public void createSecret(String secretName, String secretValue) { CreateSecretRequest request = new CreateSecretRequest() .withName(secretName) .withSecretString(secretValue); CreateSecretResult result = awsSecretsManager.createSecret(request); String arn = result.getARN(); System.out.println("Created secret with ARN: " + arn); } This code creates a new secret with the specified name and value. It uses the CreateSecretRequest class to specify the name and value of the secret and then calls the createSecret method of the AWS Secrets Manager client to create the secret. The method returns a CreateSecretResult object, which contains the ARN (Amazon Resource Name) of the newly created secret. These are just some basic steps to create secrets in Amazon Secrets Manager. Depending on your use case and requirements, there may be additional configuration or setup needed. 
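Whichever path you choose (console, CLI, or SDK), the calling identity also needs IAM permissions for the relevant Secrets Manager actions. The following policy is only a rough sketch; the region, account ID, and secret name in the resource ARN are placeholders, and you should narrow the actions and resources to what your service actually needs:

JSON
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "secretsmanager:CreateSecret",
        "secretsmanager:DescribeSecret",
        "secretsmanager:GetSecretValue"
      ],
      "Resource": "arn:aws:secretsmanager:us-east-1:123456789012:secret:my-secret-*"
    }
  ]
}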
Pulling the Secret Using Microservices Here are the complete steps for pulling a secret from the Amazon Secrets Manager using Spring Boot: First, you need to add the following dependencies to your Spring Boot project: XML <dependency> <groupId>com.amazonaws</groupId> <artifactId>aws-java-sdk-secretsmanager</artifactId> <version>1.12.37</version> </dependency> <dependency> <groupId>com.amazonaws</groupId> <artifactId>aws-java-sdk-core</artifactId> <version>1.12.37</version> </dependency> <dependency> <groupId>org.springframework.cloud</groupId> <artifactId>spring-cloud-starter-aws</artifactId> <version>2.3.2.RELEASE</version> </dependency> Next, you need to configure the AWS credentials and region in your application.yml file: YAML aws: accessKey: <your-access-key> secretKey: <your-secret-key> region: <your-region> Create a configuration class for pulling the secret: Java import org.springframework.beans.factory.annotation.Autowired; import org.springframework.cloud.aws.secretsmanager.AwsSecretsManagerPropertySource; import org.springframework.context.annotation.Configuration; import com.amazonaws.services.secretsmanager.AWSSecretsManager; import com.amazonaws.services.secretsmanager.AWSSecretsManagerClientBuilder; import com.amazonaws.services.secretsmanager.model.GetSecretValueRequest; import com.amazonaws.services.secretsmanager.model.GetSecretValueResult; import com.fasterxml.jackson.databind.ObjectMapper; @Configuration public class SecretsManagerPullConfig { @Autowired private AwsSecretsManagerPropertySource awsSecretsManagerPropertySource; public <T> T getSecret(String secretName, Class<T> valueType) throws Exception { AWSSecretsManager client = AWSSecretsManagerClientBuilder.defaultClient(); String secretId = awsSecretsManagerPropertySource.getProperty(secretName); GetSecretValueRequest getSecretValueRequest = new GetSecretValueRequest() .withSecretId(secretId); GetSecretValueResult getSecretValueResult = client.getSecretValue(getSecretValueRequest); String secretString = getSecretValueResult.getSecretString(); ObjectMapper objectMapper = new ObjectMapper(); return objectMapper.readValue(secretString, valueType); } } In your Spring Boot service, you can inject the SecretsManagerPullConfig class and call the getSecret method to retrieve the secret: Java import org.springframework.beans.factory.annotation.Autowired; import org.springframework.stereotype.Service; @Service public class MyService { @Autowired private SecretsManagerPullConfig secretsManagerPullConfig; public void myMethod() throws Exception { MySecrets mySecrets = secretsManagerPullConfig.getSecret("mySecrets", MySecrets.class); System.out.println(mySecrets.getUsername()); System.out.println(mySecrets.getPassword()); } } In the above example, MySecrets is a Java class that represents the structure of the secret in the Amazon Secrets Manager. The getSecret method returns an instance of MySecrets that contains the values of the secret. Note: The above code assumes the Spring Boot application is running on an EC2 instance with an IAM role that has permission to read the secret from the Amazon Secrets Manager. If you are running the application locally or on a different environment, you will need to provide AWS credentials with the necessary permissions to read the secret. Conclusion Amazon Secrets Manager is a secure and convenient way to store and manage secrets such as API keys, database credentials, and other sensitive information in the cloud. 
By using Amazon Secrets Manager, you can avoid hardcoding secrets in your Spring Boot application and, instead, retrieve them securely at runtime. This reduces the risk of exposing sensitive data in your code and makes it easier to manage secrets across different environments. Integrating Amazon Secrets Manager with Spring Boot is a straightforward process thanks to AWS SDK for Java. With just a few lines of code, you can create and retrieve secrets from Amazon Secrets Manager in your Spring Boot application. This allows you to build more secure and scalable applications that can be easily deployed to the cloud. Overall, Amazon Secrets Manager is a powerful tool that can help you manage your application secrets in a more secure and efficient way. By integrating it with Spring Boot, you can take advantage of its features and benefits without compromising on the performance or functionality of your application.

By Kushagra Shandilya
Building Micronaut Microservices Using MicrostarterCLI

MicrostarterCLI is a rapid development tool. It helps you, as a developer, generate the standard reusable code, configuration, or patterns you need in your application. In a previous article, I went through a basic example of creating REST and GraphQL endpoints in a Micronaut application. This article demonstrates an example of bootstrapping a Micronaut microservices application using MicrostarterCLI. The application's architecture consists of the following: Fruit Service: A simple CRUD service for Fruit objects. Vegetable Service: A simple CRUD service for Vegetable objects. Eureka Service Discovery Server. Consul Configuration Server. Spring Cloud Gateway. Set Up the Environment Download the MicrostarterCLI 2.5.0 or higher zip file. Then, unzip it and add the folder to the environment variables. You will then be able to access mc.jar, mc.bat, and mc from any folder in the operating system. To verify the configuration, check the MicrostarterCLI by running the below command from the command prompt: PowerShell mc --version My environment details are as follows: Operating System: Windows 11. Java Version: 11. IDE: IntelliJ. Let's Start Development Step 0: Create a Workspace Directory Create a folder in your system in which you will save the projects. This step is optional, but it's helpful to organize the work. In this article, the workspace is c:\workspace. Step 1: Create Fruit Service The FruitService is a simple service with all CRUD operations to handle the Fruit objects. For simplicity, we will use the H2 database. We will use the MicrostarterCLI to generate all CRUD operation code and configuration in the following steps: First, generate a Micronaut application from Micronaut Launch and extract the project zip file into the c:\workspace folder. Alternatively, we can use the init command of the MicrostarterCLI to generate the project directly from Micronaut Launch as follows: PowerShell mc init --name FruitService --package io.hashimati After running the init command, go to the FruitService directory. Then, run the entity command to add the required dependencies, configure the service, and generate the necessary CRUD services code: PowerShell cd fruit mc entity -e Fruit Once you run the command, the MicrostarterCLI will start asking configuration questions. Answer them as follows (each entry lists the prompt, the answer, and a short description):
Is the application monolithic? → no (specifies whether the service is monolithic or a microservice)
Enter the server port number between 0 - 65535 → -1 (sets the server port; with -1, the service runs on a random port)
Enter the service id: → fruit-service (sets the service ID)
Select Reactive Framework → reactor (uses the Reactor framework in case the developer wants reactive data access)
Do you want to use Lombok? → yes (uses the Lombok library to generate the entity class)
Select Annotation: → Micronaut (MicrostarterCLI can generate code with Micronaut, JAX-RS, or Spring annotations; this step instructs it to use Micronaut annotations)
Enter the database name: → fruits (sets the database name)
Select Database Type: → H2 (specifies the database management engine)
Select Database Backend: → JDBC (specifies the database access layer)
Select Data Migration: → liquibase (uses Liquibase as the data migration tool to create the database schema)
Select Messaging type: → none
Do you want to add cache-caffeine support? → yes (enables caching)
Do you want to add Micrometer feature? → yes (collects metrics on the endpoints and service calls)
Select Distributed Tracing: → Jaeger (uses Jaeger for distributed tracing)
Do you want to add GraphQL-Java-Tools support? → yes (adds the GraphQL dependency to the project)
Do you want to add GRPC support? → no (if the answer is "yes," MicrostarterCLI prepares the project for gRPC services; this example will not use gRPC)
Do you want to use File Services? → no (this article will not use storage services)
Do you want to configure VIEWS? → yes (confirms that a "views" dependency should be added)
Select the views configuration → views-thymeleaf (adds the Thymeleaf dependency)
Once you complete the configuration, the MicrostarterCLI will ask you to enter the collection's or table's name. Then, it will prompt you to enter the attributes. Enter the attributes as follows:
Name: type String, no validation, FindBy() = Yes, FindAllBy() = Yes, UpdateBy() = No
quantity: type Int, no validation, FindBy() = No, FindAllBy() = Yes, UpdateBy() = No
By the end of the step, the MicrostarterCLI will generate the classes of the entity, service, controller, client, test controllers, Liquibase XML files, and configurations. Step 2: Create Vegetable Service The Vegetable Service will host the CRUD service for the Vegetable objects. To define it, we repeat the same steps as in Step 1. Step 3: Create Eureka Server In this step, we will create the Eureka service discovery server. The service will listen on port 8761. The Fruit and Vegetable services will register with the Eureka server because we point their Eureka clients to localhost and port 8761. To create the Eureka server project using MicrostarterCLI, run the below command from c:\workspace: Shell mc eureka -version 2.7.8 --javaVersion 11 Step 4: Create a Gateway Service The last component we will create is the Spring Cloud Gateway. The gateway service will listen on port 8080. To generate the Gateway project using the MicrostarterCLI, run the gateway command below: Shell mc gateway -version 2.7.8 --javaVersion 11 Step 5: Gateway/Microservice Routes Configurations In this step, we will configure the root routes for both the Fruit and Vegetable APIs. As generated by the MicrostarterCLI, the root route for the Fruit API is /api/v1/fruit, as declared in the @Controller annotation of the io.hashimati.controllers.FruitController class of the Fruit Service. The Vegetable API's root route is /api/v1/vegetable, as in the io.hashimati.controllers.VegetableController class of the Vegetable Service. To register the routes, we will use the register subcommand of the gateway command. To run the command, go to c:\workspace\gateway and run the below command: mc gateway register When the command runs, the MicrostarterCLI will prompt you to enter the Service ID, Service Name, and the routes. Run the register subcommand twice to configure the root routes for the Fruit and the Vegetable APIs. By completing this step, you have configured the Gateway routes to the Fruit and Vegetable CRUD endpoints. Run and Try To run the services, we will create a run.bat file that launches all of them, as below: cd c:\workspace\eureka\ start gradlew bootRun -x test& cd c:\workspace\FruitService start gradlew run -x test& cd c:\workspace\VegetableService start gradlew run -x test& cd c:\workspace\gateway start gradlew bootRun -x test& After running the run.bat file, all services start; wait until they complete their start-up process, then open the Eureka dashboard (the server listens on port 8761). You should see all services registered on the Eureka server. To try the CRUD services, you can use the .http file support of IntelliJ.
We will create a test.http file as follows: HTTP POST http://localhost:8080/api/v1/fruit/save Content-Type: application/json { "name": "Apple", "quantity": 100 } ### GET http://localhost:8080/api/v1/fruit/findAll ### POST http://localhost:8080/api/v1/vegetable/save Content-Type: application/json { "name": "Onion", "quantity": 100 } ### GET http://localhost:8080/api/v1/vegetable/findAll Running the requests from IntelliJ, it works! Conclusion Using MicrostarterCLI, you can generate the configuration of the components needed for a microservices architecture, such as JWT security, distributed tracing, messaging, and observability. MicrostarterCLI also supports the Groovy and Kotlin programming languages. Please visit the project repository for more information. Check out the article example here. Happy coding!

By Ahmed Al-Hashmi
Kubernetes-Native Development With Quarkus and Eclipse JKube

This article explains what Eclipse JKube Remote Development is and how it helps developers build Kubernetes-native applications with Quarkus. Introduction As mentioned in my previous article, microservices don’t exist in a vacuum. They typically communicate with other services, such as databases, message brokers, or other microservices. Because of this distributed nature, developers often struggle to develop (and test) individual microservices that are part of a larger system. The previous article examines some common inner-loop development cycle challenges and shows how Quarkus, combined with other technologies, can help solve some of the challenges. Eclipse JKube Remote Development was not one of the technologies mentioned because it did not exist when the article was written. Now that it does exist, it certainly deserves to be mentioned. What Is Eclipse JKube Remote Development? Eclipse JKube provides tools that help bring Java applications to Kubernetes and OpenShift. It is a collection of plugins and libraries for building container images and generating and deploying Kubernetes or OpenShift manifests. Eclipse JKube Remote Development is a preview feature first released as part of Eclipse JKube 1.10. This new feature is centered around Kubernetes, allowing developers the ability to run and debug Java applications from a local machine while connected to a Kubernetes cluster. It is logically similar to placing a local development machine inside a Kubernetes cluster. Requests from the cluster can flow into a local development machine, while outgoing requests can flow back onto the cluster. Remember this diagram from the first article using the Quarkus Superheroes? Figure 1: Local development environment logically inserted into a Kubernetes cluster. We previously used Skupper as a proxy to connect a Kubernetes cluster to a local machine. As part of the 1.10 release, Eclipse JKube removes the need to use Skupper or install any of its components on the Kubernetes cluster or your local machine. Eclipse JKube handles all the underlying communication to and from the Kubernetes cluster by mapping Kubernetes Service ports to and from the local machine. Eclipse JKube Remote Development and Quarkus The new Eclipse JKube Remote Development feature can make the Quarkus superheroes example very interesting. If we wanted to reproduce the scenario shown in Figure 1, all we’d have to do is re-configure the rest-fights application locally a little bit and then run it in Quarkus dev mode. First, deploy the Quarkus Superheroes to Kubernetes. 
Then, add the Eclipse JKube configuration into the <plugins> section in the rest-fights/pom.xml file: XML <plugin> <groupId>org.eclipse.jkube</groupId> <artifactId>openshift-maven-plugin</artifactId> <version>1.11.0</version> <configuration> <remoteDevelopment> <localServices> <localService> <serviceName>rest-fights</serviceName> <port>8082</port> </localService> </localServices> <remoteServices> <remoteService> <hostname>rest-heroes</hostname> <port>80</port> <localPort>8083</localPort> </remoteService> <remoteService> <hostname>rest-villains</hostname> <port>80</port> <localPort>8084</localPort> </remoteService> <remoteService> <hostname>apicurio</hostname> <port>8080</port> <localPort>8086</localPort> </remoteService> <remoteService> <hostname>fights-kafka</hostname> <port>9092</port> </remoteService> <remoteService> <hostname>otel-collector</hostname> <port>4317</port> </remoteService> </remoteServices> </remoteDevelopment> </configuration> </plugin> Version 1.11.0 of the openshift-maven-plugin was the latest version as of the writing of this article. You may want to check if there is a newer version available. This configuration tells OpenShift (or Kubernetes) to proxy requests going to the OpenShift Service named rest-fights on port 8082 to the local machine on the same port. Additionally, it forwards the local machine ports 8083, 8084, 8086, 9092, and 4317 back to the OpenShift cluster and binds them to various OpenShift Services. The code listing above uses the JKube OpenShift Maven Plugin. If you are using other Kubernetes variants, you could use the JKube Kubernetes Maven Plugin with the same configuration. If you are using Gradle, there is also a JKube OpenShift Gradle Plugin and JKube Kubernetes Gradle Plugin available. Now that the configuration is in place, you need to open two terminals in the rest-fights directory. In the first terminal, run ./mvnw oc:remote-dev to start the remote dev proxy service. Once that starts, move to the second terminal and run: Shell ./mvnw quarkus:dev \ -Dkafka.bootstrap.servers=PLAINTEXT://localhost:9092 \ -Dmp.messaging.connector.smallrye-kafka.apicurio.registry.url=http://localhost:8086 This command starts up a local instance of the rest-fights application in Quarkus dev mode. Requests from the cluster will come into your local machine. The local application will connect to other services back on the cluster, such as the rest-villains and rest-heroes applications, the Kafka broker, the Apicurio Registry instance, and the OpenTelemetry collector. With this configuration, Quarkus Dev Services will spin up a local MongoDB instance for the locally-running application, illustrating how you could combine local services with other services available on the remote cluster. You can do live code changes to the local application while requests flow through the Kubernetes cluster, down to your local machine, and back to the cluster. You could even enable continuous testing while you make local changes to ensure your changes do not break anything. The main difference between Quarkus Remote Development and Eclipse JKube Remote Development is that, with Quarkus Remote Development, the application is running in the remote Kubernetes cluster. Local changes are synchronized between the local machine and the remote environment. With JKube Remote Development, the application runs on the local machine, and traffic flows from the cluster into the local machine and back out to the cluster. 
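If you are on a plain Kubernetes cluster rather than OpenShift, the same <remoteDevelopment> configuration can sit under the kubernetes-maven-plugin instead; the goal below is a hedged sketch and should be verified against the JKube version you are using:

Shell
# With org.eclipse.jkube:kubernetes-maven-plugin configured in place of the
# openshift-maven-plugin, start the remote dev proxy with the k8s goal prefix:
./mvnw k8s:remote-dev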
Wrap-Up As you can see, Eclipse JKube Remote Development complements the Quarkus Developer Joy story quite well. It allows you to easily combine the power of Quarkus with Kubernetes to help create a better developer experience, whether local, distributed, or somewhere in between.

By Eric Deandrea
Cypress vs. Puppeteer: A Detailed Comparison

The availability of various tools in the market has likely kept you wondering which tool is appropriate for testing your web application. Testing matters because it ensures the application functions as users expect and delivers a high-quality user experience. End-to-end testing is an approach designed to verify application functionality by automating the browser to run through the scenarios an end user would perform. Cypress and Puppeteer are two commonly used tools for this, and their detailed comparison is the focus of this blog. The use of Cypress for web automation testing has grown in recent years because it addresses issues faced by modern web applications, and Puppeteer is now also widely adopted, which has triggered the Cypress vs. Puppeteer debate. A solid understanding of both tools, and a detailed comparison between them, is therefore essential. Let's get started with an overview of Cypress and Puppeteer. What Is Cypress? Cypress is an open-source, JavaScript-based automation testing tool used mainly for modern web applications. This front-end testing framework lets us write test cases in the de facto language of the web. It supports unit tests and integration tests and offers conveniences such as easy reporting and test configuration. It also supports the Mocha test framework. Cypress works differently from other testing tools: test scripts that run inside the browser execute in the same event loop as your application, while anything that must run outside the browser is handled by a Node.js server process. Features of Cypress Some of the notable features of Cypress are as follows: · It takes snapshots as the tests run. · It allows fast, real-time debugging using tools like Developer Tools. · It has automatic waiting, so you do not have to add waits or sleeps to your tests. · You can verify and control the behavior of functions, timers, and server responses. · You can control, stub, and test edge cases without involving the servers. What Is Puppeteer? Puppeteer is an open-source Node.js library used for automation testing and web scraping. It provides a high-level API to control Chrome and Chromium, which run headless by default. Puppeteer is easy for testers to pick up because it is based on the DevTools Protocol, the same protocol used by the Chrome Developer Tools; familiarity with those tools helps you get up and running with Puppeteer quickly. Cypress vs. Puppeteer The comparison between Cypress and Puppeteer below highlights the aspects that will give you a clear picture. Brief Puppeteer is a tool developed by Google that automates Chrome using the DevTools protocol, whereas Cypress is an open-source test runner developed by Cypress.io. The main difference lies in what they do: Puppeteer is a Node library for browser automation, whereas Cypress is purely a test automation framework for end-to-end testing, integration testing, and unit testing.
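To make that distinction concrete, here is a hedged, minimal pair of examples; the URLs, selectors, and messages are purely illustrative. The first is a Cypress end-to-end test executed by the Cypress test runner, the second is a plain Puppeteer script that drives the browser through its Node.js API:

JavaScript
// Cypress: a test file picked up and run by the Cypress test runner
describe('login page', () => {
  it('rejects bad credentials', () => {
    cy.visit('/login');
    cy.get('#username').type('demo');
    cy.get('#password').type('wrong-password');
    cy.get('button[type="submit"]').click();
    cy.contains('Invalid credentials').should('be.visible');
  });
});

// Puppeteer: a plain Node.js script that automates the browser directly
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();
  await page.goto('https://example.com/login');
  await page.type('#username', 'demo');
  await page.type('#password', 'wrong-password');
  await page.click('button[type="submit"]');
  await page.waitForSelector('.error-message'); // assertions come from a separate test framework
  await browser.close();
})();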
Put another way, Puppeteer is not a framework but a Node library that provides browser automation for Chrome and Chromium. It runs headless by default but can be configured to run full Chrome or Chromium. Puppeteer exposes a high-level API for controlling Chrome and Chromium over the DevTools protocol. Cypress, by contrast, is mainly a front-end testing tool built for the modern web. Lastly, Puppeteer is free to use, whereas Cypress comes in both free and paid versions. Language Both Cypress and Puppeteer are based on JavaScript, which makes it easy to work with either tool. Types of Testing When it comes to the kinds of testing they support, Cypress gives you wider options. If you want to test an entire application, Puppeteer is not the best option; it is great for web scraping and crawling SPAs. Cypress, on the other hand, lets you write end-to-end tests, unit tests, and integration tests, and it can test anything that runs in a browser. Uses Puppeteer is mainly used for automating UI interactions such as mouse and keyboard movement, and it can exercise applications developed in AngularJS and Angular. Unlike Cypress, it is not a test automation framework; rather, it drives the internals of the Chromium browser. It is a development tool that can perform developer tasks such as locating elements and handling requests and responses. Architecture Cypress and Puppeteer differ in their architecture. Most testing tools run outside the browser and execute remote commands across the network. Cypress, however, operates inside the browser that executes the test code. This allows Cypress to listen to and verify browser behavior at run time by modifying the DOM and altering network requests and responses on the fly, and it does not require any driver binaries. Behind the scenes, a Node.js server works with the Cypress test runner, and the application and the test code run in separate iframes within the same event loop. Browsers supported by Cypress include Chrome (including Canary), Chromium, Microsoft Edge, Mozilla Firefox, and Electron. Puppeteer's architecture, as mentioned above, follows the DevTools protocol. It manages Chrome and Chromium through the high-level API provided by the Node library, driving the browser engine with or without headless mode. All test execution happens in Chromium, which is a real browser environment; other browsers, such as Microsoft Edge, also use Chromium as their engine. Puppeteer itself is distributed as a Node module, and end users write their automation code in JavaScript. Testing Speed Comparing the testing speed of Puppeteer and Cypress, Puppeteer is regarded as much faster. With Cypress, the test scripts are executed in the browser: to click a particular button, Cypress does not send a command to a separate driver but uses DOM events to send the click to the button. Puppeteer, meanwhile, has tight control over the browser thanks to its high-level API over Chrome and Chromium.
Further, Puppeteer works with minimal setup, eliminates extras, and uses less space than Cypress, so it consumes less memory and starts faster. Cypress is slower when running tests in a larger application, mainly because it takes snapshots of the application state at different points during the tests, which adds time. Puppeteer has no such overhead, which makes it faster than Cypress. Reliability and Flexibility For testing a web application, Cypress can be more user-friendly and reliable as a JavaScript framework for end-to-end testing than Puppeteer, because Puppeteer is not a framework but only a Node module bundled with Chromium. Puppeteer can be a great option for quick testing, but when you want to test the full functionality and performance of an application, a stronger tool like Cypress is the better choice. One reason is that Cypress has its own assertion library, while Puppeteer does not and instead relies on frameworks such as Mocha, Jasmine, or Jest. Further, Cypress ships with its own test runner interface, whereas Puppeteer depends on editors such as VS Code and WebStorm. In a nutshell, Puppeteer only supports Chromium-engine-based browsers, whereas Cypress supports many different browsers, making it more reliable and flexible. Testing Code Execution on the Client Side (the Web Browser) Both Puppeteer and Cypress allow testing code execution on the client side, i.e., in the web browser. In Puppeteer, you can operate the browser manually, and it is easy to create a testing environment in which tests run directly; you can test front-end functions and the UI with it. Cypress aims to test anything that can run in a browser and to help build a high-quality user experience. It tests the flow of the application from start to end from the user's point of view, and it also works equally well on older server-rendered pages and applications. Testing Behavior on the Server Side A major difference between Puppeteer and Cypress concerns testing server-side behavior: Puppeteer has no such capability, while Cypress can test back-end behavior, for example with the cy.task() command. This command provides a way to run Node code, letting users perform actions crucial to their tests that are beyond the scope of Cypress itself. Test Recording Cypress comes with a dashboard where you can see recorded tests with details of the events that happened during execution. Puppeteer has no such dashboard and cannot record tests, so that level of transparency into test execution is missing. Fixtures Fixtures are specific, fixed states of data that are local to a test; they help establish a known environment for a single test. Cypress has built-in fixture abilities: with the cy.fixture(filePath) command, you can easily load a fixed set of data from a file. Puppeteer has no such fixtures. Group Fixtures Group fixtures let you define fixed states of data for a group of tests, which helps ensure a consistent environment across that group. Here, too, Puppeteer has no such fixtures.
Cypress, at the same time, can create group fixtures with the cy.fixture command. Conclusion This blog has presented a detailed comparison of Puppeteer and Cypress, which should give you enough information to decide which tool will best fit your testing requirements. LambdaTest is a cloud-based automation testing platform with an online Cypress automation tool that lets you write simple Cypress tests and watch them run. Using LambdaTest, you can also run your Puppeteer test scripts online. Both Cypress and Puppeteer come with their own advantages and limitations; in the end, you should decide which one suits your tests best.

By Nazneen Ahmad
Unlock the Power of Terragrunt’s Hierarchy

Developers have many options to build a cloud environment using available tools, but today's complex infrastructure involves numerous interconnected services and processes. To ensure smooth operation amidst daily changes in your production environment, it is crucial to use advanced tools to design and build an elastic cloud environment. These tools can simplify coding and prevent repetitive tasks for you and your team. Here are some tips that will simplify the code and prevent repetition for you and the rest of the team: Get to Know the Terraform Tool Terraform is a tool that enables developers to define the architecture of their infrastructure through a clear and readable file format known as HCL. This file format describes the elastic components of the infrastructure, such as VPCs, Security Groups, Load Balancers, and more. It provides a simple and concise way to define the topology of the development environment. Two Major Challenges When Designing Automated Cloud Environments The first challenge arises during the initial run, when there is little or no interdependence between the different resources. This is relatively easy to handle. The second challenge is more complex and occurs when new resources need to be developed, updated, or deleted in an existing cloud environment that is constantly changing. In a cloud environment, changes should be carefully planned and executed to minimize errors between teams and ensure a smooth and efficient implementation. To achieve this, simplification of the cloud environment is crucial to prevent code duplication among developers working on the same source code. How Terragrunt Solves This Problem: An In-Depth Look Terragrunt is a tool that enhances Terraform's functionality by providing additional infrastructure management capabilities that help maintain a DRY (Don't Repeat Yourself) code base, work with existing Terraform modules, and manage the remote state of the cloud environment. An example where Terragrunt can be particularly useful is in a distributed cloud environment where multiple resources share the same values, such as subnets and security groups. Without proper management, developers may inadvertently duplicate these values, leading to errors and inconsistencies. Terragrunt helps prevent code duplication by allowing the parameters to be defined only once, ensuring there is no confusion about the shared parameters in the environment. To optimize performance, Terragrunt enforces specific development principles and restrictions on the organization of Terraform code: Terragrunt mandates the use of a hierarchical folder structure to maintain consistency and prevent errors in the Terraform code. Terragrunt promotes centralized file management for shared common variables, enabling the organization of code and changes in a single location. Optimizing Cloud Environments With Logical Organization Using Terragrunt Before using Terragrunt, it is essential to organize the logical environment by dividing it into smaller scopes through folders. This approach enables the reuse of resources and modules without the need for rewriting, promoting efficiency and reducing the risk of errors. By organizing the workspace in this manner, Terragrunt allows for the import of entries from Terragrunt.hcl files located in other hierarchies within the cloud environment.
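As an illustration of that inheritance, here is a hedged sketch of a child Terragrunt.hcl; the folder layout anticipates the example hierarchy shown further below, while the VPC output names are assumptions for illustration only:

HCL
# Infra/Backend/Service-01/Terragrunt.hcl (illustrative layout)
include {
  # Inherit remote state settings and shared inputs from the nearest parent Terragrunt.hcl
  path = find_in_parent_folders()
}

dependency "vpc" {
  # Reuse the outputs of the VPC configuration instead of redefining its values here
  config_path = "../../VPC"
}

inputs = {
  vpc_id     = dependency.vpc.outputs.vpc_id
  subnet_ids = dependency.vpc.outputs.private_subnet_ids
}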
This process avoids duplicating values required for different resources by using the "Include Block" or "Dependency Block" to import previously written values from other hierarchies in the environment. Efficient File Management in Shared Spaces for Improved Collaboration Terragrunt offers a powerful capability for sharing configuration files with ease. Much like Terraform, Terragrunt receives parameters to launch resources. However, unlike Terraform, Terragrunt allows you to define these parameters at a higher level, so different resources within the environment can make use of the same parameters. For instance, if an environment is running in the us-east-1 region, defining the region value in the root directory allows any resource within the environment to inherit the value for its own use. This approach minimizes redundancy and ensures consistency throughout the environment. Terragrunt's ability to define parameters at a higher level streamlines the configuration process and makes it easier to manage resources. For instance, consider this use case:
Infra
  VPC
    Terragrunt.hcl
  Backend
    Service-01
      Terragrunt.hcl
    Service-02
      Terragrunt.hcl
    Terragrunt.hcl
Modules
  VPC
    Terragrunt.hcl
Terragrunt.hcl
Referring to the hierarchy, as shown above: Infra: This concept organizes our environment and its structure by arranging the infrastructure in a specific order. This order starts with the VPC and everything related to it, followed by the backend and its various service definitions, and so on. Modules: This concept connects us to a group of resources that we intend to utilize. For instance, if we have decided to use a VPC in our infrastructure, we would define its source and initialization parameters within the module scope. Similarly, if our backend includes a service like Kubernetes-dashboard, we would also define its source within the module scope, and so on. Terragrunt.hcl: This file serves as the Terragrunt configuration file, as previously explained. However, we are also using it to define common values for the environment. For instance, if Service-01 and Service-02 share some common parameters, then we would define these parameters at a higher level in the Terragrunt.hcl file, under the backend scope, which is located in the root folder of both services. Furthermore, we have created a Terragrunt.hcl file within the root directory. By doing so, we are consolidating common values that pertain to the whole environment and that other parts of the hierarchy do not need to be aware of. This approach allows us to propagate shared parameters downward in the hierarchy, enabling us to customize our environment without duplicating values.
Abstract the implementation of the solution while avoiding code duplication, i.e., propagating shared variables for resources that have common values. Sharing our acquired knowledge with the community is a priority for us, and we are excited to offer our solutions to anyone interested in learning from them. You can easily access and gain insights into our methods, and even implement them yourself. As an additional resource, we have a Github repository that contains a multitude of examples for creating efficient and effective applications in a cloud environment using Terragrunt. You’re welcome to use this as a reference to help you develop your own solutions.

By Lidor Ettinger

Top Tools Experts


Bartłomiej Żyliński

Software Engineer,
SoftwareMill

I'm a Software Engineer with industry experience in designing and implementing complex applications and systems, mostly where it's not visible to users - at the backend. I'm a self-taught developer and a hands-on learner, constantly working towards expanding my knowledge further. I contribute to several open source projects, my main focus being sttp (where you can see my contributions on the project's GitHub). I appreciate the exchange of technical know-how, which is reflected in my various publications on Medium and DZone and my appearances at top tech conferences and meetups, including Devoxx Belgium. I enjoy exploring topics that combine software engineering and mathematics. In my free time, I like to read a good book.

Vishnu Vasudevan

Head of Product Engineering & Management,
Opsera

Vishnu is an experienced DevSecOps leader and a SAFe Agilist with a track record of building SaaS/PaaS containerized products and improving operational and financial results via Agile/DevSecOps and digital transformations. He has over 16 years of experience working in infrastructure, cloud engineering, and automation. Currently, he works as Director of Product Engineering at Opsera, responsible for delivering Opsera's SaaS products and customer services by using advanced analytics, standing up DevSecOps products, creating and maintaining models, and onboarding new products. Previously, Vishnu worked in leading financial enterprises as a product manager and delivery leader, where he built enterprise PaaS and SaaS products for internal application engineering teams. In his free time, he enjoys driving, mountaineering, traveling, playing soccer and cricket, and cooking.

Abhishek Gupta

Principal Developer Advocate,
AWS

I mostly work on open-source technologies, including distributed data systems, Kubernetes, and Go.

Yitaek Hwang

Software Engineer,
NYDIG

‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎ ‎

The Latest Tools Topics

Fearless Distroless
With the rise of Docker came a new focus for engineers: optimizing the build to reach the smallest image size possible. This post explores available options.
April 20, 2023
by Nicolas Fränkel CORE
· 1,661 Views · 2 Likes
Kafka: The Basics
Kafka is a powerful tool for building streaming architectures. The article serves as an introduction to both the technology and associated data producers and consumers.
April 20, 2023
by Pavel Micka
· 1,910 Views · 1 Like
AWS: Pushing Jakarta EE Full Platform Applications to the Cloud
In this article, readers will learn how to deploy more complex Jakarta EE applications as serverless services with AWS Fargate.
April 20, 2023
by Nicolas Duminil CORE
· 2,827 Views · 4 Likes
Continuing Hello World
The way we teach programming and new platforms needs to be more engaging. This same lesson applies to SaaS startup gamification.
April 20, 2023
by Shai Almog CORE
· 2,749 Views · 2 Likes
Jira Anti-Patterns
Learn why Jira anti-patterns exist and how you can counter these impediments to Agile product development.
April 19, 2023
by Stefan Wolpers CORE
· 1,326 Views · 1 Like
GitStream vs. Code Owners vs. GitHub Actions
Looking to streamline your GitHub pipeline but unsure where to start? Find out when to use gitStream vs. code owners vs. GitHub Actions.
April 19, 2023
by Dan Lines CORE
· 1,179 Views · 2 Likes
Recording Badge Scans in Apache Pinot
Use a real-time analytics database to record NFC badge scans in part 1 of this 3-part project involving environmental data, real-time notifications, and more.
April 19, 2023
by David G. Simmons CORE
· 1,538 Views · 1 Like
Apache Kafka in Java [Video Tutorials]: Architecture and Simple Consumer/Producer
Explore a video tutorial series regarding Apache Kafka in Java as you explore Kafka architecture, components, producer callbacks, and more.
April 19, 2023
by Ram N
· 1,763 Views · 1 Like
How to Move System Databases to Different Locations in SQL Server on Linux
In this article, we will explain how to move the system databases to different locations in Ubuntu Linux.
April 19, 2023
by Nirali Shastri
· 1,670 Views · 2 Likes
Stream File Uploads to S3 Object Storage and Save Money
Learn how to upload files directly to S3-compatible Object Storage from your Node application to improve availability and reduce costs.
April 19, 2023
by Austin Gil CORE
· 2,141 Views · 1 Like
How We Built a 1% Website in 3 Days for €7
We built a website in three days, and two days after launch, we had 128 unique website visitors. Learn how we did it and how you can reproduce it.
April 18, 2023
by Thomas Hansen CORE
· 1,445 Views · 3 Likes
Apache Kafka for Data Consistency
Apache Kafka ensures data consistency across legacy batch, request-response mobile apps, and real-time streaming in a data mesh architecture.
April 18, 2023
by Kai Wähner CORE
· 1,769 Views · 1 Like
Unlock the Mysteries of AWS Lambda Invocation: Asynchronous vs. Synchronous
Unlock the full potential of your AWS Lambda functions with this deep dive into the differences between asynchronous and synchronous invocations for maximum efficiency.
April 18, 2023
by Satrajit Basu CORE
· 2,540 Views · 1 Like
OpenShift vs. Kubernetes: The Unfair Battle
In this article, we compare OpenShift and Kubernetes, and the comparison is far from fair.
April 18, 2023
by Rahul Shivalkar
· 3,018 Views · 2 Likes
Guide to Creating and Containerizing Native Images
In this article, we will learn how to turn Java applications into native images and then containerize them for further deployment in the cloud.
April 18, 2023
by Dmitry Chuyko
· 3,198 Views · 1 Like
Start Playwright for Component Testing
This approach helps identify and fix issues early in the development process, leading to a more stable and reliable final product.
April 17, 2023
by Kailash Pathak [Cypress Ambassador]
· 2,764 Views · 2 Likes
How to Query Your AWS S3 Bucket With Presto SQL
In this tutorial, the reader will learn how to query an S3-based data lake with Presto, the open-source SQL query engine.
April 17, 2023
by Rohan Pednekar
· 2,482 Views · 1 Like
GitHub Exposed a Private SSH Key: What You Need to Know
Everyone has secrets leakage incidents from time to time, even massive players like GitHub. This is a good reminder we all need to stay vigilant.
April 15, 2023
by Dwayne McDaniel
· 4,229 Views · 1 Like
Is Apache Kafka Providing Real Message Ordering?
We’re told that Apache Kafka preserves message ordering per topic/partition, but how true is that, and how close can we come to making it true?
April 14, 2023
by Francesco Tisiot
· 4,214 Views · 1 Like
Improve AWS Security and Compliance With CDK-nag?
AWS Cloud Development Kit (AWS CDK) is a powerful tool that allows developers to define cloud infrastructure in code using familiar programming languages.
April 14, 2023
by Jeroen Reijn CORE
· 3,431 Views · 2 Likes