Software design and architecture focus on the development decisions made to improve a system's overall structure and behavior in order to achieve essential qualities such as modifiability, availability, and security. The Zones in this category are available to help developers stay up to date on the latest software design and architecture trends and techniques.
Cloud architecture refers to how technologies and components are built in a cloud environment. A cloud environment comprises a network of servers that are located in various places globally, and each serves a specific purpose. With the growth of cloud computing and cloud-native development, modern development practices are constantly changing to adapt to this rapid evolution. This Zone offers the latest information on cloud architecture, covering topics such as builds and deployments to cloud-native environments, Kubernetes practices, cloud databases, hybrid and multi-cloud environments, cloud computing, and more!
Containers allow applications to run quicker across many different development environments, and a single container encapsulates everything needed to run an application. Container technologies have exploded in popularity in recent years, leading to diverse use cases as well as new and unexpected challenges. This Zone offers insights into how teams can solve these challenges through its coverage of container performance, Kubernetes, testing, container orchestration, microservices usage to build and deploy containers, and more.
Integration refers to the process of combining software parts (or subsystems) into one system. An integration framework is a lightweight utility that provides libraries and standardized methods to coordinate messaging among different technologies. As software connects the world in increasingly complex ways, integration makes it all possible by facilitating app-to-app communication. Learn more about this necessity for modern software development by keeping a pulse on industry topics such as integrated development environments, API best practices, service-oriented architecture, enterprise service buses, communication architectures, integration testing, and more.
A microservices architecture is a development method for designing applications as modular services that seamlessly adapt to a highly scalable and dynamic environment. Microservices help solve complex issues such as speed and scalability, while also supporting continuous testing and delivery. This Zone will take you through breaking down the monolith step by step and designing a microservices architecture from scratch. Stay up to date on the industry's changes with topics such as container deployment, architectural design patterns, event-driven architecture, service meshes, and more.
Performance refers to how well an application conducts itself compared to an expected level of service. Today's environments are increasingly complex and typically involve loosely coupled architectures, making it difficult to pinpoint bottlenecks in your system. Whatever your performance troubles, this Zone has you covered with everything from root cause analysis, application monitoring, and log management to anomaly detection, observability, and performance testing.
The topic of security covers many different facets within the SDLC. From focusing on secure application design to designing systems to protect computers, data, and networks against potential attacks, it is clear that security should be top of mind for all developers. This Zone provides the latest information on application vulnerabilities, how to incorporate security earlier in your SDLC practices, data governance, and more.
Microservices and Containerization
According to our 2022 Microservices survey, 93% of our developer respondents work for an organization that runs microservices. This number is up from 74% when we asked this question in our 2021 Containers survey. With most organizations running microservices and leveraging containers, we no longer have to discuss the need to adopt these practices, but rather how to scale them to benefit organizations and development teams. So where do adoption and scaling practices of microservices and containers go from here? In DZone's 2022 Trend Report, Microservices and Containerization, our research and expert contributors dive into various cloud architecture practices, microservices orchestration techniques, security, and advice on design principles. The goal of this Trend Report is to explore the current state of microservices and containerized environments to help developers face the challenges of complex architectural patterns.
In the previous post, we discussed the AWS S3 service and its various use cases. We then set up an AWS S3 bucket with the configuration and access needed for our web application's data storage requirements. We created a .NET 6 Web API project and some basic wiring/configuration to allow our application to access S3. However, we still have to write the application code that will allow our users to store notes data (files) in an S3 bucket and read the information back from these files for processing (CRUD operations). In this post, we will look at how to perform these operations from our .NET application code with the help of the AWS SDK for S3. If you haven't already, I suggest you read the previous post, as we will be building on that foundation.

AWSSDK.S3 NuGet Package
The AWS SDK, available as a NuGet package, simplifies the code needed to interact with the Amazon S3 service. We can add it to our solution using the package manager, as shown below. Notice that the SDK is very modular, and this structure helps us import only the packages we need for our requirements instead of pulling in a lot of unrelated code.

Domain Model
We have a very simple domain model for our application. The main entity is Note, which represents an individual note from a user. We also have another entity, NoteSummary, which, as the name implies, stores summary information about notes. Here are the model classes for these entities:

Storage Service
Next, to store and retrieve domain models to and from the S3 bucket, our application needs a service, which is defined by the following interface:
Following is the S3NoteStorageService implementation of this interface:
As you can see, this implementation uses the IAmazonS3 object, which is the abstraction provided by the AWSSDK.S3 NuGet package we added earlier. We'll not go into the details of these methods' code; it is self-explanatory, and you can check it in this GitHub repository. I also added the following line in the Program.cs file to enable dependency injection of this service in our application:
Next, let's focus on the controller side to wire all this up.

API Controller
The following screenshot shows the NotesController code. As you can see, INotesStorageService is injected into the controller via constructor injection, and we can now use this service in the various action methods as needed to interact with the S3 service. My plan is to later update this application with user logins, but for the sake of simplicity, I've hard-coded the user here, so all notes will be saved under this user for now. Let's see the code for getting the list of notes for a user:
Again, the code here is self-explanatory and very typical controller code, and you can check all the methods in detail in the GitHub repository.

Testing the REST API
OK, with all of this in place, I started the application and used the Postman client to test it.

Adding a Note
This is the request payload to add a note. Once executed, we will have the following files created in the S3 bucket: one file for the note itself and another file for the summary. Try adding more notes using Postman, and you will see more files created in your bucket.

List Notes
Next, we can test the API's list notes action as shown below. As you can see, we are getting data from all the notes.
Delete Note
In a similar way, we can test the delete operation by providing a NoteId, as shown below:

Get Note
Here is the API call to get a single note's details by providing a NoteId:
You can check the source code in this Git repository.

Summary
In this post, we covered the .NET application code for our notes application and how it uses the AWS SDK to interact with S3. The AWS SDK simplifies our code by providing abstractions that are easy to use in our application code. We built a very simple REST API that allows us to perform CRUD operations on our domain model. Our application can now benefit from all the great features of the S3 storage service, and we can easily integrate it with other AWS services for more advanced use cases. Let me know if you have any comments or questions. Till next time, Happy Coding.
“Set it and forget it” is the approach that most network teams follow with their authoritative Domain Name System (DNS). If the system is working and end-users find network connections to revenue-generating applications, services, and content, then administrators will generally say that you shouldn’t mess with success. Unfortunately, the reliability of DNS often causes us to take it for granted. It’s easy to write DNS off as a background service precisely because it performs so well. Yet this very “set it and forget it” strategy often creates blind spots for network teams by leaving performance and reliability issues undiagnosed. When those undiagnosed issues pile up or go unaddressed for a while, they can easily metastasize into a more significant network performance problem. The reality is that, like any machine or system, DNS requires the occasional tune-up. Even when it works well, specific DNS errors require attention so minor issues don’t flare up into something more consequential. I want to share a few pointers for network teams on what to look for when they’re troubleshooting DNS issues.

Set Baseline DNS Metrics
No two networks are configured alike. No two networks have the same performance profile. Every network has quirks and peculiarities that make it unique. That’s why knowing what’s “normal” for your network is important before diagnosing any issues. DNS data can give you a sense of average query volume over time. For most businesses, this is going to be a relatively stable number. There will probably be seasonal variations (especially in industries like retail), but these are usually predictable. Most businesses see gradual increases in query volume as their customer base or service volume grows, but this also generally follows a set pattern. It’s also important to look at the mix of query volume. Is most of your DNS traffic to a particular domain? How steady (or volatile) is the mix of DNS queries among various back-end resources? The answers to these questions will be different for every enterprise and may change based on network team decisions on issues like load balancing, product resourcing, and delivery costs.

Monitor NXDOMAIN Responses
NXDOMAIN responses are a clear indication that something’s wrong. It’s normal to return at least some NXDOMAINs for “fat finger” queries, standard redirect errors, and user-side issues that are likely outside of a network team’s control. A recent Global DNS data report from NS1, an IBM Company, shows that between 3% and 6% of DNS queries receive an NXDOMAIN response for one reason or another. Anything at or near that range is probably to be expected in a “normal” network setup. When the share climbs into double digits, something bigger is probably happening. The nature of the pattern matters, though. A slow but steady increase in NXDOMAIN responses is probably a long-standing misconfiguration issue that mimics overall traffic volume. A sudden spike in NXDOMAINs could be either a localized (but highly impactful) misconfiguration or a DDoS attack. The key is to keep a steady eye on NXDOMAIN responses as a percentage of overall query volume. Deviation from the norm is usually a clear sign that something is not right — then it becomes a question of why it’s not right and how to fix it. In most cases, a deeper dive into the timing and characteristics of the abnormal uptick will provide clues about why it’s happening. NXDOMAIN responses aren’t always a bad thing. In fact, they could represent a potential business opportunity.
If someone’s trying to query a domain or subdomain of yours and coming up empty, that could indicate that it’s a domain you should buy or start using.

Watch Out for Exposure of Internal DNS Data
One particularly concerning type of NXDOMAIN response is caused by misconfigurations that expose internal DNS zone and record data to the internet. Not only does this kind of misconfiguration weigh on performance by creating unnecessary query volume, but it’s also a significant security issue. Stale URL redirects are often the cause of exposed internal records. In the upheaval of a merger or acquisition, systems sometimes get pointed at properties that fade away or are repurposed for other uses. The systems are still publicly looking for the old connection but not finding the expected answer. The smaller the workload, the more likely it is to go unnoticed.

Pay Attention to Geography
If you set a standard baseline for where your traffic is coming from, it’s easier to discover anomalous DDoS attacks, misconfigurations, and even broader changes in usage patterns as they emerge. A sudden uptick in traffic to a specific regional server is a different kind of issue than a broader increase in overall query volume. Tracking your DNS data by geography helps identify the issue you’re facing and ultimately provides clues on how to deal with it.

Check SERVFAILs for Misconfigured Alias Records
Alias records are a frequent source of misconfigurations and deserve regular audits in their own right. I’ve found that an increase in SERVFAIL responses — whether a sudden spike or a gradual increase — can often be traced back to problems with alias records.

NOERROR NODATA? Consider IPv6
NXDOMAIN responses are pretty straightforward — the record wasn’t found. Things get a little more nuanced when you see the response come back as NOERROR, but you also see that no answer was returned. While there’s no official RFC code for this situation, it’s usually known as a NOERROR NODATA response when the answer counter returns “0”. NOERROR NODATA means that the name exists, but it doesn’t hold a record of the type that was requested. If you’re seeing a lot of NOERROR NODATA responses, in our experience the resolver is usually looking for an AAAA record, and adding support for IPv6 usually fixes the problem. (A short sketch of how to tell these response types apart programmatically follows after the conclusion.)

DNS Cardinality and Security Implications
In the world of DNS, there are two types of cardinality to worry about. Resolver cardinality refers to the number of resolvers querying your DNS records. Query name cardinality refers to the number of different DNS names for which you receive queries each minute. Measuring DNS cardinality is important because it may indicate malicious activity. Specifically, an increase in DNS query name cardinality can indicate a random label attack or probing of your infrastructure at a mass level. A sudden increase in resolver cardinality may indicate that you are being targeted with a botnet or some other kind of attack.

Conclusion
These pointers should help you better understand the impact of DNS query behavior and some steps you can take to get your DNS to a healthy state. Feel free to comment below on any other tips you’ve learned throughout your career.
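To make the NXDOMAIN versus NOERROR NODATA distinction easier to spot in practice, here is a minimal sketch of how a small check could classify responses. It assumes the open-source dnsjava library (org.xbill.DNS) is on the classpath, and the queried name is only a placeholder.
Java
import org.xbill.DNS.Lookup;
import org.xbill.DNS.TextParseException;
import org.xbill.DNS.Type;

public class DnsResponseCheck {

    public static void main(String[] args) throws TextParseException {
        // Placeholder name; point this at a record you actually serve.
        Lookup lookup = new Lookup("www.example.com", Type.AAAA);
        lookup.run();

        switch (lookup.getResult()) {
            case Lookup.SUCCESSFUL:
                System.out.println("AAAA record found: IPv6 is being answered.");
                break;
            case Lookup.HOST_NOT_FOUND:
                // NXDOMAIN: the name itself does not exist.
                System.out.println("NXDOMAIN: check for typos, stale redirects, or missing zones.");
                break;
            case Lookup.TYPE_NOT_FOUND:
                // NOERROR NODATA: the name exists, but not for this record type.
                System.out.println("NOERROR NODATA: the name resolves but has no AAAA record; consider adding IPv6.");
                break;
            default:
                System.out.println("Lookup failed: " + lookup.getErrorString());
        }
    }
}
Running the same check for Type.A and Type.AAAA side by side is a quick way to confirm whether a flood of NOERROR NODATA responses is really an IPv6 gap.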
AWS Cloud Development Kit (AWS CDK) is a powerful tool that allows developers to define cloud infrastructure in code using familiar programming languages like TypeScript, Python, and Java. However, as with any infrastructure-as-code tool, it's important to ensure that the resulting infrastructure adheres to security and compliance best practices. This is where CDK-nag comes in.

What Is CDK-nag?
CDK-nag is an open-source tool that provides automated checks for AWS CDK code and the resulting CloudFormation templates to help ensure that they adhere to security and compliance best practices. After you add CDK-nag to your project, it checks for a variety of known security and compliance issues, including overly permissive IAM policies, missing access logs, and unintentionally public S3 buckets. CDK-nag also checks for common mistakes that can lead to security vulnerabilities, such as the use of plain-text passwords and the use of default security groups. The great thing about CDK-nag is that it allows you to catch mistakes at a very early stage in the process. Ideally, you can catch them while developing your infrastructure as code in CDK on your local machine. As an alternative, you can add CDK-nag to your CI/CD pipeline and make the build fail in case of any issues.

Adding CDK-nag to Your Project
Using CDK-nag is simple. First, add it as a dependency to your AWS CDK project. If you're using Java, you can add it to your pom.xml file.

XML
<dependency>
  <groupId>io.github.cdklabs</groupId>
  <artifactId>cdknag</artifactId>
  <version>2.25.2</version>
</dependency>

After you've added the dependency, you will need to explicitly enable CDK-nag using a CDK aspect. You can apply CDK-nag in the scope of your entire CDK application or just in the scope of a single CDK stack. CDK-nag works with rules that are defined in packs. Those packs are based on AWS Config conformance packs. If you've never looked at AWS Config, the Operational Best Practices for HIPAA Security page is a good place to start in the context of these CDK-nag conformance packs. By default, CDK-nag comes with several rule packs out of the box. Based on your requirements, you can enable one or more rule packs. Let's take a look at how to apply such a rule pack.

Java
public class AwsCdkNagDemoApp {
    public static void main(final String[] args) {
        App app = new App();
        new AwsCdkNagDemoStack(app, "AwsCdkNagDemoStack", StackProps
                .builder()
                .env(Environment.builder()
                        .account(System.getenv("CDK_DEFAULT_ACCOUNT"))
                        .region(System.getenv("CDK_DEFAULT_REGION"))
                        .build())
                .build()
        );
        Aspects.of(app)
                .add(
                        AwsSolutionsChecks.Builder
                                .create()
                                .verbose(true)
                                .build()
                );
        app.synth();
    }
}

As you can see in the code fragment above, we've enabled the AwsSolutionsChecks rules for the scope of the entire CDK app. In this example, we've explicitly enabled verbose mode as it will generate more descriptive messages. Now let's take a look at an example stack and see how CDK-nag responds to it. The stack below is a very simple stack that contains an AWS Lambda function processing messages from an SQS queue.
Java
public AwsCdkNagDemoStack(final Construct scope, final String id, final StackProps props) {
    super(scope, id, props);
    final Queue queue = Queue.Builder.create(this, "demo-queue")
            .visibilityTimeout(Duration.seconds(300))
            .build();
    final Function function = Function.Builder
            .create(this, "demo-function")
            .handler("com.jeroenreijn.demo.aws.cdknag.FunctionHandler")
            .code(Code.fromAsset("function.jar"))
            .runtime(Runtime.JAVA_11)
            .events(List.of(
                    SqsEventSource.Builder.create(queue).build())
            )
            .build();
    queue.grantConsumeMessages(function);
}

Analyzing Results
Now when you run cdk synth from the command line, it will trigger CDK-nag, which automatically scans the resources in the resulting templates and checks them for security and compliance issues. Once the scan is done, CDK-nag will either return successfully or return an error message and output a list of violations in a format that is easy to understand. After running cdk synth, we will get the following messages in our output.

Plain Text
[Error at /AwsCdkNagDemoStack/demo-queue/Resource] AwsSolutions-SQS3: The SQS queue is not used as a dead-letter queue (DLQ) and does not have a DLQ enabled. Using a DLQ helps maintain the queue flow and avoid losing data by detecting and mitigating failures and service disruptions on time.

[Error at /AwsCdkNagDemoStack/demo-queue/Resource] AwsSolutions-SQS4: The SQS queue does not require requests to use SSL. Without HTTPS (TLS), a network-based attacker can eavesdrop on network traffic or manipulate it, using an attack such as man-in-the-middle. Allow only encrypted connections over HTTPS (TLS) using the aws:SecureTransport condition in the queue policy to force requests to use SSL.

[Error at /AwsCdkNagDemoStack/demo-function/ServiceRole/Resource] AwsSolutions-IAM4[Policy::arn:<AWS::Partition>:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole]: The IAM user, role, or group uses AWS managed policies. An AWS managed policy is a standalone policy that is created and administered by AWS. Currently, many AWS managed policies do not restrict resource scope. Replace AWS managed policies with system specific (customer) managed policies. This is a granular rule that returns individual findings that can be suppressed with 'appliesTo'. The findings are in the format 'Policy::<policy>' for AWS managed policies. Example: appliesTo: ['Policy::arn:<AWS::Partition>:iam::aws:policy/foo'].

Found errors

As you can see, CDK-nag spotted some errors and explains what we can do to improve our infrastructure. Usually, it's quite easy to fix these errors, and for findings you consciously accept, you can add a suppression with a documented reason (a short sketch follows at the end of this article). Level 2 CDK constructs already incorporate some of the best practices, so when using them, you will probably find fewer errors compared to using Level 1 constructs. The messages depend on the rule pack you select. For instance, when we switch to the HIPAASecurityChecks rule pack, we will get some duplicates but also some additional error messages.

Plain Text
[Error at /AwsCdkNagDemoStack/demo-function/Resource] HIPAA.Security-LambdaConcurrency: The Lambda function is not configured with function-level concurrent execution limits - (Control ID: 164.312(b)). Ensure that a Lambda function's concurrency high and low limits are established. This can assist in baselining the number of requests that your function is serving at any given time.

[Error at /AwsCdkNagDemoStack/demo-function/Resource] HIPAA.Security-LambdaDLQ: The Lambda function is not configured with a dead-letter configuration - (Control ID: 164.312(b)). Notify the appropriate personnel through Amazon Simple Queue Service (Amazon SQS) or Amazon Simple Notification Service (Amazon SNS) when a function has failed.

[Error at /AwsCdkNagDemoStack/demo-function/Resource] HIPAA.Security-LambdaInsideVPC: The Lambda function is not VPC enabled - (Control IDs: 164.308(a)(3)(i), 164.308(a)(4)(ii)(A), 164.308(a)(4)(ii)(C), 164.312(a)(1), 164.312(e)(1)). Because of their logical isolation, domains that reside within an Amazon VPC have an extra layer of security when compared to domains that use public endpoints.

...

The HIPAASecurityChecks rule pack also finds issues related to Lambda function concurrency and running your Lambda function inside a VPC. As you can see, different packs look at different things, so it's worthwhile to explore the different packs and see how they can help you improve. It's worth mentioning that CDK-nag does not implement all rules defined in these AWS Config conformance packs. You can check which rules are excluded in the CDK-nag excluded rules documentation.

Summary
Overall, CDK-nag is a powerful tool for ensuring that your AWS CDK code and templates adhere to security and compliance best practices. By catching security issues early in the development process, CDK-nag can help you build more secure and reliable infrastructure. I've used it in many projects over the last couple of years, and it has consistently added value, especially in teams that do not have a lot of AWS experience. If you're using AWS CDK, I highly recommend giving CDK-nag a try. The example code in this post and a working project can be found on GitHub.
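When you decide that a particular finding is acceptable, for instance keeping the AWS managed basic execution role in a demo stack, CDK-nag lets you record that decision as a suppression instead of silencing the rule globally. The snippet below is a minimal sketch of that mechanism based on the cdk-nag Java bindings; the rule ID, the reason text, and the function variable from the stack above are used purely as an example.
Java
import io.github.cdklabs.cdknag.NagPackSuppression;
import io.github.cdklabs.cdknag.NagSuppressions;
import java.util.List;

// Inside the stack constructor, after the demo function has been defined:
NagSuppressions.addResourceSuppressions(
        function,
        List.of(NagPackSuppression.builder()
                .id("AwsSolutions-IAM4")
                .reason("The AWS managed basic execution role is acceptable for this demo function.")
                .appliesTo(List.of(
                        "Policy::arn:<AWS::Partition>:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"))
                .build()),
        true // also apply the suppression to child constructs such as the generated service role
);
Suppressions like this keep the build green while leaving an auditable trail of why a rule was not applied.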
Cybersecurity for embedded devices, such as the Internet of Things (IoT) and other connected devices, is becoming increasingly important as these devices become more ubiquitous in our daily lives. The risks of the rising tide of security threats are significant. Beyond reputational damage, competitive threats, eroding customer confidence, and safety challenges, regulators are also paying increasing attention. Root of trust and certificate management are common for device identity and protection. Here are some other critical considerations for improving the cybersecurity of embedded devices.

Secure Boot
Implementing a secure boot process is a critical first step in securing an embedded device. Secure boot ensures that only authorized firmware is loaded onto the device during the boot process, preventing the device from being compromised by unauthorized software. In addition, secure booting mechanisms, such as those based on Trusted Platform Module (TPM) technology and a Public Key Infrastructure (PKI) tree, can ensure that only trusted code is executed on the device. This can prevent malware or malicious code from being executed on the device.

Encryption
All data transmitted between the device and other systems should be encrypted to prevent eavesdropping and data theft. Encryption can help protect sensitive data stored on the device and transmitted over networks.

Access Controls
Limiting access to the device through proper access controls can help prevent unauthorized access and tampering. Measures include strong passwords, biometric authentication, multi-factor authentication, and limiting access to only authorized users. If the application supports certificate-based authentication, it is best to use it instead of relying on passwords. Also, disable unnecessary protocols, IP addresses, and ports.

Code Signing
Digitally signing executables and scripts helps confirm the software author and guarantee that the code has not been altered or corrupted. A cryptographic hash is used to validate authenticity and integrity. A minimal verification sketch follows at the end of these considerations.

Firmware Updates
Regularly updating the firmware on an embedded device is critical for patching any known vulnerabilities. It is crucial to ensure that firmware updates are secure and authenticated and that the device is not vulnerable to attacks during the update process.

Hardware Security
Hardware-level security features can also be implemented to provide additional layers of protection. These can include secure storage of sensitive data, hardware-based encryption, secure key storage, and disabling unused hardware ports like USB, COM, and JTAG. In addition, secure storage for keys provides integrity and confidentiality guarantees for data stored in persistent memory.

Use Network Segmentation
Network segmentation can help isolate the device from other devices on the network and prevent attackers from accessing other devices if the embedded device is compromised.

Conduct Regular Security Assessments
Regular security assessments can help identify vulnerabilities and weaknesses in the device's security, allowing them to be addressed before attackers can exploit them.

Testing and Validation
Regular testing and validation of the device's security measures are critical to ensuring the device remains secure over time. Some embedded security features are complex to implement, and their hardware dependencies must be tested for potential implementation issues. This can include both automated and manual testing, as well as regular security audits and assessments.
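To make the code-signing consideration above more concrete, here is a minimal sketch of verifying a detached firmware signature against a signer's certificate, written in Java using only the standard java.security APIs. The file names and the SHA256withRSA algorithm are assumptions for illustration; on a real device the equivalent check is typically performed by the bootloader or update agent.
Java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.PublicKey;
import java.security.Signature;
import java.security.cert.CertificateFactory;
import java.security.cert.X509Certificate;

public class FirmwareSignatureCheck {

    public static boolean isSignatureValid(Path firmware, Path detachedSig, Path signerCert) throws Exception {
        // Load the signer's certificate and extract its public key.
        PublicKey publicKey;
        try (InputStream in = Files.newInputStream(signerCert)) {
            X509Certificate cert = (X509Certificate) CertificateFactory
                    .getInstance("X.509")
                    .generateCertificate(in);
            publicKey = cert.getPublicKey();
        }

        // Verify the detached signature over the full firmware image.
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(publicKey);
        verifier.update(Files.readAllBytes(firmware));
        return verifier.verify(Files.readAllBytes(detachedSig));
    }

    public static void main(String[] args) throws Exception {
        boolean ok = isSignatureValid(
                Path.of("firmware.bin"),
                Path.of("firmware.bin.sig"),
                Path.of("vendor-signing-cert.pem"));
        System.out.println(ok
                ? "Signature valid: image is authentic and unmodified."
                : "Signature INVALID: reject the update.");
    }
}
Rejecting any image that fails this check, before it is ever written to flash, is what ties the code-signing and secure-boot considerations together.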
Conclusion
The biggest challenge for embedded systems might be having no GUI or physical access. Small-footprint devices typically have far less battery power, processing speed, and memory than PCs or phones. Resource limitations and tool availability will limit some of these feature implementations. Developers should take protective measures based on their device's intended usage, interfaces, and connectivity. By implementing these and other cybersecurity measures, embedded devices can be better protected against the growing threats of cyberattacks and data breaches.
Hey there! So, have you ever heard of Microservices Architecture? It's a modern approach to building software systems that are flexible, scalable, and easy to maintain. In this blog post, we're going to give you the lowdown on what Microservices Architecture is, its benefits, and how Java can be a great fit for building microservices. First things first, Microservices Architecture is an approach where a software system is broken down into smaller, independent services that communicate with each other through APIs. Each service is responsible for a specific business function and can be developed, deployed, and scaled independently. This makes it easier to maintain and modify the system, as changes made to one service don't affect the entire system. Benefits of Microservices Architecture The microservices architecture provides several benefits over the traditional monolithic architecture, including: Scalability: Since each service is independent, it can be scaled horizontally to handle increased traffic or load without affecting the other services. Fault tolerance: In a monolithic architecture, a single failure can bring down the entire system. In contrast, microservices architecture is resilient to failure since the services are distributed, and each failure only affects the corresponding service. Faster time-to-market: Microservices enable faster development and deployment since each service can be developed, tested, and deployed independently without affecting other services. Improved flexibility: Microservices architecture enables easy integration with third-party services and the ability to use different technologies and languages for each service. How Java Is Suitable for Microservices Architecture Java is an ideal programming language for developing microservices architecture due to the following reasons: Robustness: Java is known for its reliability, stability, and performance, making it an excellent choice for developing microservices. Platform-independent: Java code can run on any platform or operating system without modification, making it highly portable. Wide range of frameworks: Java has a rich ecosystem of frameworks that provide powerful tools and features for building microservices, such as Spring Boot, Micronaut, Quarkus, and Jakarta EE. Understanding the Fundamentals of Microservices Architecture 1. Breaking Down Monoliths Into Microservices To create a microservices architecture, it is essential to break down monolithic applications into smaller and independent services. This process involves identifying the core functionalities and components of the monolithic application and separating them into individual services. By doing so, each service can have its own development cycle, deployment, and scaling, enabling faster innovation and improved agility. 2. Defining Microservices Boundaries Defining microservices boundaries involves determining the scope and responsibilities of each service. Each microservice should be responsible for a specific business capability or function. This helps to maintain a clear separation of concerns and enables teams to work independently without affecting other services. 3. Implementing Independent Services Communication Microservices need to communicate with each other to complete a task. In a microservices architecture, services communicate via APIs, which are well-defined contracts that specify how services interact with each other. 
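For illustration, here is a minimal sketch of such an API contract implemented as a Spring Boot REST endpoint. The OrderServiceApplication and OrderController names, the record fields, and the URL path are made up for the example; the point is that the path and the JSON shape it returns are the contract other services depend on.
Java
import java.util.List;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
public class OrderServiceApplication {
    public static void main(String[] args) {
        SpringApplication.run(OrderServiceApplication.class, args);
    }
}

// A tiny, self-describing payload; its fields are part of the service's public contract.
record Order(String id, String customerId, double total) {}

@RestController
class OrderController {

    // Other services call this endpoint over HTTP and rely only on the path and JSON shape.
    @GetMapping("/orders/{customerId}")
    List<Order> ordersForCustomer(@PathVariable String customerId) {
        // In a real service, this would come from the service's own data store.
        return List.of(new Order("o-1001", customerId, 49.99));
    }
}
Because the contract is just HTTP and JSON, the consuming service needs no knowledge of this service's language, framework, or database.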
Implementing independent service communication requires creating robust and reliable APIs that can handle different types of requests and responses. 4. Enforcing Service Isolation and Resiliency Service isolation is an important principle of microservices architecture. It involves ensuring that each service is independent and self-contained. This means that if one service fails, it should not impact the entire system. To enforce service isolation, it is essential to use techniques such as fault tolerance, circuit breakers, and bulkheads. These techniques ensure that if one service fails, the other services can continue to function without any issues. Java Tools and Technologies for Microservices Architecture Java provides various tools and technologies for implementing microservices architecture. Here are some of the popular Java frameworks that support microservices architecture: Spring Boot Spring Boot is an open-source Java-based framework that helps in building standalone, production-grade Spring-based applications with minimal configuration. How Spring Boot Supports Microservices: Spring Boot provides a set of features that makes it easy to develop and deploy microservices. It offers a variety of tools to quickly create and manage microservices, such as embedded servers, auto-configuration, and seamless integration with other Spring modules. Benefits of Using Spring Boot: Spring Boot simplifies the development process by reducing the amount of boilerplate code needed, providing a flexible configuration, and enabling developers to focus on the business logic. Dropwizard Dropwizard is a high-performance Java-based framework that helps in building RESTful web services with minimum configuration. How Dropwizard Supports Microservices: Dropwizard is a great choice for building microservices because of its ability to package a complete application into a single executable JAR file. It also provides powerful features for monitoring and managing microservices, such as health checks, metrics, and logging. Benefits of Using Dropwizard: Dropwizard simplifies the development process by providing a streamlined set of tools and configurations to build and deploy microservices. It also provides a comprehensive set of metrics and monitoring tools that help in managing the services effectively. Micronaut Micronaut is a lightweight Java-based framework that helps in building modular, easily testable microservices and serverless applications. How Micronaut Supports Microservices: Micronaut provides a variety of features to support microservices, including fast startup time, low memory footprint, and minimal configuration. It also includes built-in support for service discovery, load balancing, and circuit breaking. Benefits of Using Micronaut: Micronaut provides a highly efficient and scalable microservices development platform that enables developers to build and deploy microservices rapidly. It also has extensive documentation and a growing community, making it easy to get started and find help when needed. Implementing Microservices Architecture With Java Now that we have an understanding of the fundamentals of microservices architecture and the Java tools available for building microservices, it's time to dive into implementing microservices architecture with Java. 1. Designing Microservices Architecture With Java The first step in implementing microservices architecture with Java is designing the architecture itself. 
This involves breaking down the monolithic application into smaller, independent microservices and defining the boundaries between them. It's important to consider factors such as communication protocols, data storage, and service isolation. 2. Building Microservices With Java Once the architecture is designed, it's time to start building the microservices themselves using one of the Java tools we discussed earlier. This involves creating a new project for each microservice, defining its endpoints, and implementing its functionality. The use of a framework like Spring Boot can greatly simplify this process. 3. Testing Microservices With Java Testing is a crucial step in ensuring the reliability and functionality of the microservices. This involves creating unit tests for each microservice to ensure it's functioning correctly, as well as integration tests to ensure that the microservices are able to communicate with each other and operate as a cohesive system. 4. Deploying Microservices With Java Finally, the microservices need to be deployed to a production environment. This involves packaging each microservice into a container and using an orchestration tool like Kubernetes to manage and deploy the containers. It's important to consider factors such as scalability and reliability when deploying microservices to ensure they can handle increased traffic and remain stable under heavy loads. Best Practices for Creating Microservices Architecture With Java Implementing Continuous Integration and Continuous Deployment Automating the build, test, and deployment process to reduce the risk of human error. Ensuring version control is in place to easily manage changes and rollbacks. Using containerization technologies like Docker to improve the consistency and portability of microservices. Ensuring Service Isolation and Resiliency Designing microservices to be loosely coupled and independent of one another. Implementing fault-tolerance mechanisms like circuit breakers and retry policies to prevent cascading failures. Using distributed tracing to track the flow of requests across multiple microservices. Securing Microservices With Java Implementing authentication and authorization mechanisms to control access to microservices. Using encryption to secure communication between microservices. Using API gateways to protect microservices from malicious traffic and provide a single point of entry. Monitoring and Logging Microservices With Java Using tools like Prometheus and Grafana to monitor the performance of microservices. Using centralized logging tools like ELK stack to aggregate and analyze logs from multiple microservices. Implementing proactive monitoring to identify potential issues before they affect end-users. Conclusion In conclusion, creating a microservices architecture with Java can provide many benefits, such as improved scalability, flexibility, and modularity. With the use of Java frameworks and technologies like Spring Boot, Dropwizard, and Micronaut, developers can create efficient and reliable microservices. However, it's essential to follow best practices such as implementing continuous integration and deployment, ensuring service isolation and resiliency, securing microservices, and monitoring and logging microservices. By following these practices, Java development services can create high-quality microservices architecture that can meet the demands of modern software development.
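As a small illustration of the circuit breaker practice listed in the resiliency best practices above, here is a hedged sketch using the Resilience4j library; the service name, thresholds, and the inventoryClient call are placeholders rather than a prescribed configuration.
Java
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;
import java.time.Duration;
import java.util.function.Supplier;

public class InventoryCaller {

    private final CircuitBreaker circuitBreaker;

    public InventoryCaller() {
        CircuitBreakerConfig config = CircuitBreakerConfig.custom()
                .failureRateThreshold(50)                        // open the breaker once 50% of calls fail
                .waitDurationInOpenState(Duration.ofSeconds(30)) // stay open for 30s before probing again
                .slidingWindowSize(20)                           // evaluate the last 20 calls
                .build();
        this.circuitBreaker = CircuitBreakerRegistry.of(config).circuitBreaker("inventory-service");
    }

    public String fetchStockLevel(String productId) {
        // Wrap the remote call; when the breaker is open, calls fail fast instead of piling up.
        Supplier<String> decorated = CircuitBreaker.decorateSupplier(
                circuitBreaker, () -> inventoryClient(productId));
        try {
            return decorated.get();
        } catch (Exception e) {
            return "unknown"; // a fallback keeps this service responsive while the dependency recovers
        }
    }

    private String inventoryClient(String productId) {
        // Placeholder for the actual HTTP call to the inventory microservice.
        return "42";
    }
}
Frameworks such as Spring Boot and Micronaut can wire the same behavior in through annotations, but the underlying idea is identical: isolate the failing dependency so it cannot cascade.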
Containerization has resulted in many businesses and organizations developing and deploying applications differently. A recent report by Gartner indicated that by 2022, more than 75% of global organizations would be running containerized applications in production, up from less than 30% in 2020. However, while containers come with many benefits, they certainly remain a source of cyberattack exposure if not appropriately secured. Previously, cybersecurity meant safeguarding a single "perimeter." By introducing new layers of complexity, containers have rendered this concept outdated. Containerized environments have many more abstraction levels, which necessitates using specific tools to interpret, monitor, and protect these new applications.

What Is Container Security?
Container security is the use of a set of tools and policies to protect containers from potential threats that could affect an application, infrastructure, system libraries, runtime, and more. Container security involves implementing a secure environment for the container stack, which consists of the following:
Container image
Container engine
Container runtime
Registry
Host
Orchestrator
Most software professionals automatically assume that Docker and the Linux kernel are secure from malware, an assumption that overestimates their built-in protection.

Top 5 Container Security Best Practices

1. Host and OS Security
Containers provide isolation from the host, although they both share kernel resources. Often overlooked, this aspect makes it more difficult, but not impossible, for an attacker to compromise the OS through a kernel exploit and gain root access to the host. Hosts that run your containers need their own set of security controls in place, starting with keeping the underlying host operating system up to date and, for example, running the latest version of the container engine. Ideally, you will need to set up some monitoring to be alerted to any vulnerabilities on the host layer. Also, choose a "thin OS," which will speed up your application deployment and reduce the attack surface by removing unnecessary packages and keeping your OS as minimal as possible. Essentially, in a production environment, there is no need to let a human admin SSH to the host to apply any configuration changes. Instead, it would be best to manage all hosts through IaC with Ansible or Chef, for instance. This way, only the orchestrator has ongoing access to run and stop containers.

2. Container Vulnerability Scans
Regular vulnerability scans of your container or host should be carried out to detect and fix potential threats that hackers could use to access your infrastructure. Some container registries provide this kind of feature; when your image is pushed to the registry, it will automatically be scanned for potential vulnerabilities. One way you can be proactive is to set up a vulnerability scan in your CI pipeline by adopting the "shift left" philosophy, which means you implement security early in your development cycle. Trivy would be an excellent choice to achieve this. If you are trying to set up this kind of scan for your on-premises nodes, Wazuh is a solid option that will log every event and verify it against multiple CVE (Common Vulnerabilities and Exposures) databases.

3. Container Registry Security
Container registries provide a convenient and centralized way to store and distribute images. It is common to find organizations storing thousands of images in their registries.
Since the registry is so important to the way a containerized environment works, it must be well protected. Therefore, investing time to monitor and prevent unauthorized access to your container registry is something you should consider.

4. Kubernetes Cluster Security
Another action you can take is to reinforce security around your container orchestration, such as preventing risks from over-privileged accounts or attacks over the network. Following the least-privilege access model and protecting pod-to-pod communication would limit the damage done by an attack. A tool that we would recommend in this case is Kube Hunter, which acts as a penetration testing tool. As such, it allows you to run a variety of tests on your Kubernetes cluster so you can start taking steps to improve its security. You may also be interested in Kubescape, which is similar to Kube Hunter; it scans your Kubernetes cluster, YAML files, and Helm charts to provide you with a risk score.

5. Secrets Security
A container or Dockerfile should not contain any secrets (certificates, passwords, tokens, API keys, etc.), and still we often see secrets hard-coded into the source code, images, or build process. Choosing a secret management solution will allow you to store secrets in a secure, centralized vault.

Conclusion
These are some of the proactive security measures you may take to protect your containerized environments. This is vital because Docker has only been around for a short period, which means its built-in management and security capabilities are still in their infancy. Thankfully, achieving decent security in a containerized environment can be done with multiple tools, such as the ones we listed in this article.
API governance refers to the set of policies, procedures, and practices that organizations adopt to ensure the effective management and control of their Application Programming Interfaces (APIs). A well-designed API governance framework helps organizations to establish guidelines and best practices for developing, deploying, and managing APIs. It provides a structured approach to API development and helps ensure consistency in the APIs that are offered to internal and external stakeholders. Effective API governance also helps organizations to identify and mitigate risks associated with APIs, such as security vulnerabilities, compliance issues, and performance concerns. By implementing API governance best practices, organizations can optimize their API portfolio, improve collaboration across teams, and increase the value derived from their API investments. The diagram illustrates the various components that make up an API governance framework. At the center of the diagram is the API governance board, which is responsible for overseeing and managing the governance process. The board is made up of representatives from different business units and technology teams within the organization. The API governance framework comprises multiple critical components that work together to enable effective API management across an organization. These components include security, technology, utilization, education, monitoring, standards, performance, and compliance. API Security The security component is vital in ensuring that APIs are designed and implemented with robust security features to prevent unauthorized access, data breaches, and other security risks. The technology component focuses on selecting the most appropriate technology stack for the API and ensuring that it integrates seamlessly with other existing systems. API security is a critical component of API governance, as it involves protecting APIs from unauthorized access, misuse, and other security threats. To ensure the security of APIs, various measures can be taken, such as OWASP testing, penetration testing, API utilization monitoring, code reviews, and authentication and authorization mechanisms. OWASP testing is a standardized security testing process that aims to identify potential security vulnerabilities in APIs. It involves various security tests, such as injection attacks, cross-site scripting, and access control issues, among others. Penetration testing is another technique used to identify potential security vulnerabilities in APIs. It involves simulated attacks on the API to identify potential weaknesses that could be exploited by attackers. API utilization monitoring is a process of detecting unusual or unnatural patterns in API usage. This monitoring can help identify potential security threats and protect against API misuse and abuse. Code reviews are also a vital aspect of API security. By thoroughly reviewing API code, developers can identify potential security flaws and vulnerabilities and address them before they become a threat. Authentication and authorization mechanisms are crucial in protecting APIs from unauthorized access. These mechanisms ensure that only authorized users have access to the API and that users are only able to access the resources they are authorized to access. Additionally, API endpoints must be designed in a way that distinguishes between users and administrators to prevent unauthorized access. 
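As a small, hedged illustration of the user/administrator separation described above, here is what such an access policy could look like with a recent version of Spring Security; the paths, role names, and configuration class are assumptions made for the example rather than a prescribed setup.
Java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.Customizer;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.web.SecurityFilterChain;

@Configuration
public class ApiAccessConfig {

    @Bean
    SecurityFilterChain apiFilterChain(HttpSecurity http) throws Exception {
        http
            .authorizeHttpRequests(auth -> auth
                .requestMatchers("/api/admin/**").hasRole("ADMIN")      // administrative endpoints
                .requestMatchers("/api/**").hasAnyRole("USER", "ADMIN") // regular user endpoints
                .anyRequest().denyAll())                                // everything else is rejected
            .httpBasic(Customizer.withDefaults());                      // replace with OAuth2/JWT in practice
        return http.build();
    }
}
Keeping this separation in the gateway or framework layer, rather than scattered through handler code, makes it much easier to audit who can reach which endpoints.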
In conclusion, implementing robust security measures such as OWASP testing, penetration testing, API utilization monitoring, code reviews, and authentication and authorization mechanisms is essential to ensure the security and integrity of APIs. By implementing these measures, organizations can protect their APIs from potential security threats and ensure the safe and secure delivery of API services to their users. API Technology API technology is a crucial component of API governance that involves selecting and managing the appropriate technology stack for APIs. To ensure the effective use of technology, several measures can be taken, such as cataloging all existing technologies, identifying new technologies to introduce and retire outdated technologies, checking Gartner Magic Quadrant, identifying production incidents and tagging them with technologies, and promoting the adaptation of new technologies while defining deadlines to retire outdated technologies. Cataloging all existing technologies allows organizations to have a clear understanding of the technologies used for API development and management. This information can help identify potential gaps, overlaps, and redundancies in the technology stack. Identifying new technologies to introduce and retire outdated technologies is another vital aspect of API technology. It ensures that the technology stack remains up-to-date and aligned with the latest industry standards and best practices. Additionally, checking the Gartner Magic Quadrant can help organizations identify emerging technologies and evaluate their suitability for API development and management. Identifying production incidents and tagging them with technologies is also critical in evaluating the effectiveness of the current technology stack. It helps identify potential areas of improvement and identify technologies that are not meeting business requirements. Finally, promoting the adaptation of new technologies and defining deadlines to retire outdated technologies can help ensure that the technology stack remains current and aligned with the organization's business goals. It also ensures that the organization is using the most appropriate technologies for API development and management. Overall, effective API technology management involves the careful selection and management of the appropriate technology stack for APIs. By implementing measures such as cataloging existing technologies, identifying new technologies, evaluating production incidents, and promoting the adaptation of new technologies, organizations can ensure the efficient and effective use of technology for API development and management. API Utilization API utilization is a critical component of API governance that involves monitoring and optimizing the use of APIs to ensure maximum value and efficiency. To achieve this, several measures can be taken, such as centralizing API utilization on a dashboard, retiring or merging unused APIs, scaling and optimizing highly used APIs, exposing an API Catalog to increase utilization, publishing APIs and monetizing usage across the organization, and calculating API costs and enforcing cost optimization. Centralizing API utilization on a dashboard allows organizations to have a clear view of API usage across various applications and services. This dashboard can help identify potential bottlenecks, over-utilized or under-utilized APIs, and areas that require optimization. Retiring or merging unused APIs is another important aspect of API utilization. 
It ensures that APIs are not consuming unnecessary resources and reduces the overall complexity of the API ecosystem. Scaling and optimizing highly used APIs is also essential to ensure that APIs can handle increased traffic and usage effectively. By identifying highly used APIs, organizations can optimize their performance, increase their scalability, and improve overall user experience. Exposing an API Catalog, such as SwaggerHub or similar, can increase API utilization by making it easier for developers to discover and reuse existing APIs instead of building new ones. This not only saves time and resources but also promotes consistency and reduces the overall complexity of the API ecosystem. Publishing APIs and monetizing usage across the organization is another effective way to increase API utilization. By monetizing API usage, organizations can create incentives for developers to use existing APIs and reduce the development of redundant APIs. Finally, calculating API costs and enforcing cost optimization is critical in managing the overall cost of API utilization. By identifying the runtime costs, such as CPU, network, and log volume, organizations can enforce cost optimization measures and minimize unnecessary costs. Overall, effective API utilization management involves monitoring and optimizing API usage to ensure maximum value and efficiency. By implementing measures such as centralizing API utilization, retiring unused APIs, optimizing highly used APIs, exposing an API Catalog, publishing APIs, monetizing usage, and enforcing cost optimization, organizations can ensure the efficient and effective use of APIs while minimizing costs. API Monitoring API monitoring is a critical aspect of API governance that involves tracking and analyzing the performance, availability, and security of APIs. To ensure effective API monitoring, organizations can introduce best practices and implement several measures. Firstly, it is essential to establish best practices in API monitoring. This includes setting up a dedicated monitoring system and defining monitoring metrics and thresholds to ensure the health and performance of APIs. This system should track key performance indicators such as response time, error rates, and uptime. Publishing log retention and volume numbers is another important aspect of API monitoring. It helps organizations to identify trends and patterns in API usage and performance, which can aid in decision-making and optimization. For example, organizations can use this data to identify over-utilized APIs that need to be optimized or to detect unauthorized access or security breaches. Setting alerts for unusual patterns, spikes, and other anomalies can help organizations quickly identify and respond to issues. This ensures that any problems are addressed promptly and reduces the risk of prolonged downtime or service disruptions. Finally, monitoring APIs for SQL injections and other attacks is critical to ensuring API security. Organizations should implement measures to prevent and detect such attacks, such as input validation, encryption, and access controls. This helps to minimize the risk of unauthorized access or data breaches and ensures that APIs are secure and compliant. In summary, effective API monitoring involves implementing best practices, publishing log retention and volume numbers, setting up alerts for unusual patterns, and monitoring for SQL injections and other attacks. 
These measures help to ensure the health, performance, and security of APIs, minimize downtime and disruptions, and maximize their value to the organization. API Standards API standards are a crucial component of API governance, as they ensure that APIs are developed and maintained consistently across the organization. To ensure that API standards are followed, organizations can implement several measures. One key standard that should be followed is the OpenAPI standard, which defines a standard way to describe RESTful APIs. Adhering to this standard ensures that APIs are well-documented, easy to understand, and interoperable with other APIs. To automate the audit process and find issues, organizations can implement tools that scan API code and configurations for potential issues, such as security vulnerabilities or compliance violations. This helps to ensure that APIs are secure, compliant, and high-quality. Another important standard is the requirement for teams to commit API YAML files and Postman collections. This ensures that API documentation and testing artifacts are kept up to date and easily accessible to other teams. It also helps to ensure that APIs are well-documented, tested, and ready for use. Finally, supporting canary, blue/green releases, and versioning is critical to ensuring that APIs can be deployed and managed effectively. Canary releases involve deploying new features to a small subset of users to test their impact before releasing them to the wider audience. Blue/Green releases involve maintaining two identical environments (blue and green) and switching traffic between them to ensure that updates are rolled out smoothly. Versioning involves assigning a unique version number to each API to ensure that changes to one version do not affect other versions. In summary, API standards are critical to API governance, and organizations can implement several measures to ensure that they are followed, including adhering to the OpenAPI standard, automating the audit process, requiring teams to commit API YAML files and Postman collections, and supporting Canary, Blue/Green releases, and versioning. These measures help to ensure that APIs are well-documented, tested, secure, compliant, and deployed and managed effectively. API Performance API performance is a crucial aspect of API governance, as it directly impacts the user experience and operational costs of an organization's API portfolio. To ensure that APIs perform optimally, several measures can be implemented. One key measure is to monitor API performance using tools such as Kibana, AppDynamics, or Nginx. These tools provide insights into API response times, error rates, and other performance metrics. By monitoring API performance, organizations can quickly identify issues and take corrective action. Another important measure is to set alerts for low-performing APIs and inform teams to act. This ensures that performance issues are identified and addressed in a timely manner, minimizing the impact on users and reducing operational costs. Low-performing APIs can lead to high consumption costs and customer dissatisfaction. Users expect APIs to be fast and responsive, and any delays or errors can lead to frustration and lost business. By optimizing API performance, organizations can provide a better user experience and reduce operational costs. 
In summary, API performance is a critical aspect of API governance. Organizations can keep it under control by monitoring performance with tools such as Kibana, AppDynamics, or Nginx, setting alerts for low-performing APIs and informing teams to act, and optimizing APIs to provide a better user experience and reduce operational costs.

API Compliance

API compliance is a crucial aspect of API governance because it ensures that APIs adhere to internal corporate policies and external regulatory requirements. There are two main types of compliance: corporate and regulatory. Corporate compliance refers to adherence to the internal policies and standards that govern API development and usage within an organization, such as data security, privacy, and governance policies; it is important to identify and agree on which corporate policies apply and to ensure that APIs follow them. Regulatory compliance refers to adherence to external regulations and standards, such as HIPAA for healthcare or PCI DSS for payment card data; organizations need to identify which regulations apply to their APIs and ensure compliance with them. To verify compliance, continuous scans, both static and dynamic, can be run to identify potential issues: a static scan analyzes the code for security vulnerabilities, while a dynamic scan exercises the running API, for example by simulating an attack, to uncover vulnerabilities that only appear at runtime. It is also important to shift API security left by integrating these compliance scans into the development process, so that issues are found and fixed early in the development cycle.
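As a deliberately simplified illustration of the idea behind a dynamic scan, the sketch below sends a SQL-injection-style payload to a hypothetical endpoint and fails if the API responds with a server error. Real scanners automate this across the whole API surface with far richer payloads, but even a small probe like this can run in a CI pipeline to help shift security left. The endpoint URL and parameter name are assumptions.

import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class InjectionProbe {
    public static void main(String[] args) throws Exception {
        // Hypothetical endpoint and query parameter used only for illustration.
        String malicious = "1' OR '1'='1";
        String url = "https://api.example.com/v1/orders?customerId="
                + URLEncoder.encode(malicious, StandardCharsets.UTF_8);

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder().uri(URI.create(url)).GET().build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

        // A well-behaved API should reject or safely handle the payload;
        // a 5xx response here is treated as a failed check.
        if (response.statusCode() >= 500) {
            throw new IllegalStateException("API appears vulnerable or unstable: " + response.statusCode());
        }
        System.out.println("Probe passed with status " + response.statusCode());
    }
}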
In Conclusion

API governance is a complex, multi-faceted discipline that spans security, technology, compliance, utilization, monitoring, performance, and education. By applying best practices in each of these areas, organizations can ensure that their APIs are secure, efficient, and compliant, and that they deliver the most value to their users. API security protects against attacks and vulnerabilities through measures such as OWASP-based reviews and penetration testing, authentication, and authorization. API technology is constantly evolving, so organizations should catalog existing technologies, identify new ones to introduce, and retire outdated ones to keep their APIs current and efficient. API compliance keeps APIs aligned with internal corporate policies and external regulatory requirements, supported by continuous scans that identify potential issues. API utilization involves monitoring usage patterns, retiring unused APIs, scaling highly used ones, and exposing an API catalog to increase reuse and monetization. API monitoring involves promoting best practices, setting alerts for unusual patterns, and detecting SQL injection and other attacks. API performance is critical to efficient operation: monitor it with tools such as Kibana, AppDynamics, and Nginx, set alerts for low-performing APIs, and optimize them to reduce consumption costs and improve customer satisfaction.
IBM App Connect Enterprise (ACE) has supported the concept of “shared classes” for many releases, enabling use cases such as providing supporting Java classes for JMS providers and caching data in Java static variables so that it is available across a whole server, among other scenarios. Some of these scenarios are less critical in a containerized server, and others can be handled with shared libraries instead, but for the remaining scenarios there is still a need for the shared classes capability in containers.

What Is the Equivalent of /var/mqsi/shared-classes in Containers?

Adding JARs to shared classes is relatively simple when running ACE in a virtual machine: copying the JAR files into a specific directory such as /var/mqsi/shared-classes allows all flows in all servers to make use of the Java code. Other locations apply only to certain integration nodes or servers, but the principle is the same, and the copy only needs to be performed once for a given version of a supporting JAR because it persists across redeploys and reboots. The container world is different: a container starts from a fixed image every time, so copying files into a specific location must either be done when building the container image or repeated every time the container starts (changes made to a running container are generally non-persistent). Further complicating matters is the way flow redeploy works with containers: the new flow runs in a new container, and the old container with the old flow is deleted, so any changes made to the old container are lost.

Two main categories of solution exist in the container world:

- Copy the shared classes JARs into the container image during the container build, or
- Deploy the shared classes JARs in a BAR file or configuration in IBM Cloud Pak for Integration (CP4i) and configure the server to look for them.

There is also a modified form of the second category that uses persistent volumes to hold the supporting JARs, but from an ACE point of view it is very similar to the CP4i configuration method. The following discussion uses an example application from the GitHub repo at https://github.com/trevor-dolby-at-ibm-com/ace-shared-classes to illustrate the question and some of the answers.

Original Behavior With ACE in a Virtual Machine

Copying the supporting JAR file into /var/mqsi/shared-classes was sufficient when running in a virtual machine: the application could use the classes without further configuration, it would start and run successfully, and other applications could use the same shared classes across all servers.

Container Solution 1: Copy the Shared Classes JARs in While Building the Container Image

This solution has several variants, but they all result in the container starting up with the supporting JAR already in place. ACE servers automatically look in the “shared-classes” directory within the work directory, so it is possible to simply copy the JARs into that location; the following lines from the Dockerfile in the repo mentioned above show this:

# Copy the pre-built shared JAR file into place
RUN mkdir /home/aceuser/ace-server/shared-classes
COPY SharedJava.jar /home/aceuser/ace-server/shared-classes/

The server in the container will then load the JAR into the shared classloader. Note that this solution also works for servers running locally during development in a virtual machine.
It also means that any change to the supporting JAR requires a rebuild of the container image, but this may not be a problem if a CI/CD pipeline is used to build application-specific container images. The server may also be configured to look elsewhere for shared classes by setting the additionalSharedClassesDirectories parameter in server.conf.yaml. This parameter can be set to a list of directories, after which the supporting JAR files can be placed anywhere in the container; in the example repo, it points at a “/git/ace-shared-classes” directory containing the JAR file. This variant is most useful when the needed JAR files are already present in the image, possibly as part of another application installation.

Container Solution 2: Deploy the Shared Classes JARs in a BAR File or Configuration in CP4i

For many CP4i use cases, the certified container image is used unmodified, so the previous solution does not apply because it requires changing the container image. In these cases, the supporting JAR files can be deployed either in a BAR file or as a “generic files” configuration; in both cases, the server must be configured to look for shared classes in the chosen location.

If the JAR files are small enough, or if the shared artifacts are just properties files, then a “generic files” configuration is a possible solution, as that type of configuration is a ZIP file that can contain arbitrary contents. The repo linked above shows an example of this: the supporting JAR file is placed in a ZIP file in a subdirectory called “extra-classes”, and additionalSharedClassesDirectories is set to “/home/aceuser/generic/extra-classes”. If a persistent volume is used instead, the “generic files” configuration is not needed and additionalSharedClassesDirectories should point to the PV location; note that the PV must then be populated separately and managed appropriately, which in many cases includes allowing multiple simultaneous versions of the JARs.

The JAR file can also be placed in a shared library and deployed in a BAR file, which allows the supporting JARs to be any size and also allows a specific version of the supporting JARs to be used with a given application. In this case, the supporting JARs are copied into a shared library, and additionalSharedClassesDirectories is set to point the server at that library. The example uses a shared library called SharedJavaLibrary, so additionalSharedClassesDirectories is set to “{SharedJavaLibrary}”. Note that shared libraries used this way cannot also be used by applications in the same server.

Summary

Existing solutions that rely on shared classes can be migrated to containers without being rewritten, using either of the two categories of solution described above. The first category is preferable when building container images is possible, while the second is preferable when a certified container image is used as-is. For further reading on container image deployment strategies, see “Comparing Styles of Container-Based Deployment for IBM App Connect Enterprise”; ACE servers can be configured to work with shared classes regardless of which strategy is chosen.
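As a recap of the setting used in the second category, a server.conf.yaml entry might look like the sketch below. The paths and library name come from the examples above, only one of the two lines would be active in a real server, and the exact placement within the file should be checked against the sample server.conf.yaml shipped with your ACE version.

# server.conf.yaml (sketch; only the shared-classes setting shown)
# Point the server at the "generic files" configuration location...
additionalSharedClassesDirectories: '/home/aceuser/generic/extra-classes'
# ...or at a deployed shared library instead:
#additionalSharedClassesDirectories: '{SharedJavaLibrary}'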
The performance of APIs/services plays an important role in providing a better experience to users, and there are many ways to improve it. This article covers some tips for doing so. The following metrics need to be considered when optimizing the performance of APIs/services:

- Response time: how quickly the API/service responds.
- Payload and its size: the amount of data transported over the network.
- Throughput: the number of times the API/service is invoked.

These parameters need to be optimized to make APIs/services perform better, and the techniques below can help.

Caching

Caching is one of the most widely used techniques for reducing the number of requests that reach the server, and thereby the time taken to serve responses to clients. Caching can be done at the CDN layer, at the API gateway layer, or both, depending on how many calls need to be kept away from the server. APIs that return the same response for the same request are good candidates, and the cache duration has to be set appropriately so that clients still receive sufficiently fresh responses.

Compression

Compressing content before transporting it over the network reduces network latency. Algorithms such as Brotli or gzip can be used. Although compression and decompression add processing overhead, the time spent on them is usually far less than the time saved by transporting less data over the network.

Payload Size

The amount of data returned by the server in an API response plays a major role in that API's performance. Sending only the data the client actually needs reduces the network and processing overhead of transporting it.

Logging

Avoid excessive logging, which makes the server spend time writing log files instead of processing requests, and set an appropriate log level to avoid overly detailed output. As a best practice, consider checking the log level before logging (or using parameterized logging), irrespective of the level configured; this avoids the string concatenation or object conversion that happens while arguments are being prepared for the logger methods, saving memory and execution time (a short sketch appears at the end of this article). For example, in Java, the statement below performs the string concatenation even when the log level is set to INFO and the message is never written:

log.debug("This is a test " + "Message");

Asynchronous logging is another option to consider, as the application does not wait for the log write to complete; the only drawback is that there may be a delay before details appear in the log file.

Optimizing the API/Service Code

The following best practices help optimize code at the server level:

- Cache frequently accessed content on the server; for example, frequently accessed data from the database can be cached to avoid repeated database calls.
- Avoid converting content from one format to another where possible; for example, converting JSON to String is a costly operation in Java.
- Fetch only the data needed from the database or any other data source.
- Use connection pooling to connect to the database or other data sources.
- Eager-initialize the content, objects, and details the application needs during start-up, to avoid spending that time on the initial requests.
- Avoid or reduce synchronized areas/blocks within the code.
- Configure proper timeouts for external/third-party calls to avoid threads waiting indefinitely.

Auto-Scaling

Auto-scaling improves the performance of APIs/services by scaling the number of application instances up or down based on incoming traffic. Measure the capacity of a single instance, then decide how many instances to scale up or down depending on the expected traffic; the size of each instance can be decided based on the infrastructure needed or available. Finalize the instance size and scaling rules by conducting performance testing before the application goes to production.

Settings

Optimize application and application-server settings for the best performance. Areas worth tuning include:

- Connection pool
- Thread pool
- Memory and GC
- Cache/cache framework
- Auto-scaling
- Web server
- Other frameworks or third-party components used by the application

It is good practice to tune these settings through performance testing that simulates production-like behavior. Optimizing the performance of APIs/services improves the overall user experience of the application or website and also makes better use of the underlying infrastructure. This can be achieved using some or all of the best practices and tuning tips listed above. Of course, each approach has pros and cons, so strike the right balance based on the needs of your application.
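To illustrate the logging guidance above, here is a minimal Java sketch, assuming SLF4J is on the classpath; the class and method names are made up for the example. It shows parameterized logging, which skips message assembly when the level is disabled, plus an explicit level check for genuinely expensive arguments.

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class OrderService {
    private static final Logger log = LoggerFactory.getLogger(OrderService.class);

    public void process(String orderId, String payload) {
        // Parameterized logging: the placeholder is only substituted if DEBUG
        // is enabled, so no concatenation cost is paid at INFO or above.
        log.debug("Processing order {}", orderId);

        // For expensive arguments (large payloads, object-to-JSON conversion),
        // guard explicitly so the work is skipped entirely at higher log levels.
        if (log.isDebugEnabled()) {
            log.debug("Order payload: {}", summarize(payload));
        }
    }

    private String summarize(String payload) {
        // Stand-in for an expensive formatting or conversion step.
        return payload.length() > 100 ? payload.substring(0, 100) + "..." : payload;
    }
}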
In today's rapidly evolving technology landscape, cloud infrastructure has become an indispensable part of modern business operations. To manage this complex infrastructure, documenting its setup, configuration, and ongoing maintenance is critical. Without proper documentation, it becomes challenging to scale the infrastructure, onboard new team members, troubleshoot issues, and ensure compliance. At Provectus, I have seen how handing over projects with proper documentation enables a successful transition and preserves customer satisfaction. Whether you are an active engineer, an engineering team leader, or a demanding user of cloud infrastructure, this article will help you understand the importance of documentation and offer some easy steps for implementing best practices.

Why Is Documentation Important?

Documentation is what allows any process to be maintained consistently. It is a store of knowledge that can be referenced later and replicated when needed; if an engineer or anyone else in the organization has performed, tested, and improved a process, failing to document it wastes intellectual capital and is a loss to the organization. Documentation is important for many reasons:

- It keeps processes and systems up to date and usable
- It helps with the onboarding and training of new team members
- It improves security by imposing boundaries
- It serves as proof for audits
- It provides a starting point when documenting from scratch
- It supports continuous process improvement

Documenting your cloud infrastructure is imperative for its smooth and efficient operation.

What Should Be Documented for Cloud Infrastructure?

In the past, building a computing infrastructure required a huge investment and extensive planning, taking into account the required expertise and the needs of your organization, and once servers and hardware were purchased, it was very difficult to make significant changes. The cloud brought major improvements, making infrastructure much easier and more feasible to implement, but the ability to make changes and improvements still depends heavily on accurate documentation. The following is a basic list of what to document to ensure that your cloud infrastructure is easy to use and update.

Architecture Diagrams

An architecture diagram is a visual representation of cloud components and the interconnections that support their underlying applications. Its main goal is to communicate with all stakeholders — clients, developers, engineers, and management — in a common language that everyone can understand. To create a diagram, you need a list of components and an understanding of how they interact; you may need multiple diagrams if the architecture is complex or spans several environments. User-friendly tools, many of them free, can help with this first step, for example Diagrams.net (formerly Draw.io), Miro, SmartDraw, and Lucidchart. An architecture diagram helps with future planning and design when you are ready to improve the infrastructure, makes it easier to spot issues or areas that need improvement, supports troubleshooting by helping engineers detect flaws and find their root causes, and helps with compliance and security requirements.
How-To Instructions

Your infrastructure will likely host many features and applications that require specific steps for access. How-to instructions provide end users with a detailed step-by-step guide that streamlines various processes and saves time. Such instructions are sometimes called detailed process maps, DIYs (do it yourself), walkthroughs, job aids, tutorials, runbooks, or playbooks. Examples of processes that benefit from how-to instructions include:

- How to request access for developers
- How to subscribe to an SNS topic
- How to rotate IAM access keys
- How to retrieve ALB logs

Policies

Your cloud infrastructure will have its own policies, whether they are predefined by the IT department or created in collaboration with different teams. Policies worth documenting include:

- Access policies: what security measures are in place, and what is required for various individuals, groups, or roles to gain access? What are the criteria and procedures for access removal? Are we compliant with the least-privilege best practice?
- Security policies: protective policies covering management, practices, and resources for data in the cloud.
- Data privacy policies: how data is classified and collected so that it stays secure and protected from unauthorized access.
- Compliance policies: which regulations and auditing processes must be complied with to use cloud services, and what the responsibilities of infrastructure team members are.
- Incident and change management: the steps required to respond to incidents and changes, including outage prioritization, SLA response times, ownership, and post-mortem processes.
- Monitoring: alongside incident management, document the monitors and channels in place to ensure the infrastructure is up and running; monitoring is a 24/7 preventative complement to incident management.

Disaster Recovery

A disaster recovery plan is one of the most important yet least prioritized documents. It should outline the procedures needed to restore services after a disaster event and cover at least the following items:

- Scope
- Steps to restore service as soon as possible
- How to assess damage or data loss (risk assessment)
- Emergency response: who should be notified, and how?
- Steps to back up all data and services

The main goal of a disaster recovery plan is to ensure that business operations continue even after a disaster; the inability to recover is a major gap in any infrastructure.

Best Practices You Should Follow

Formatting

When creating documentation, it is important to follow certain rules:

- Organization: an established company will usually have a brand book that sets boundaries and provides guidelines for content. For documentation, you may need to use a specific font, size, and layout, and you may be required to include a logo or other elements. Before documenting, find out what the company requirements are; if there are no established guidelines, create your own to establish consistency across your department.
- Grammar: the way you write documentation should also follow a standard. Use an active voice ("We describe the entire infrastructure as code using Terraform") rather than a passive one ("The entire infrastructure is described as code using Terraform"). Avoid long sentences; stick with simply structured sentences that are easy for the reader to follow. Create a glossary of abbreviations and use consistent terminology.
For example, if you mention an SSL certificate but then refer to it as TLS elsewhere, the reader might be confused. Also use appropriate verb tenses, for example the present tense for describing a procedure and the past tense for describing a completed action.
- Storage: when saving a document, always use a conventional name that makes it easy to find and share with others, and store the file in the most appropriate path or structure, such as a particular file system or a collaborative tool like Confluence. File naming example: departmentname_typeofdocument_nameofdocument_mm_yyyy, for instance ManagedServices_internal_stepsfordocumentation_03_2023.

Content

How you present your document's content plays an important role in the overall process. A document that is attractively laid out and easy to read helps prevent confusion and avoids unnecessary questions. Here are some tips:

- Screenshots: a picture is worth a thousand words. Use screenshots to help the user follow your instructions, for example a capture showing where, within your AWS account, to open the EC2 Dashboard and check the security groups.
- Diagrams: a flow chart provides a visual aid for describing a step-by-step process so the reader can easily see which step they are on, for example: open the console, ping the corresponding IP, if you get an error copy the message, open a ticket in AnyDesk, paste the error message, and assign it to AnyTeam.
- Table of contents: use heading formats to create a table of contents. If the document is large, readers can jump to a specific section; that reader could be you, updating the document a few months later.
- Troubleshooting: readers will likely run into issues when putting your document into action, so include a troubleshooting section that helps resolve common problems.

Lifecycle

One of the most common documentation mistakes is to assume the work is finished because the project is up and running. Keeping documentation up to date is an important part of its lifecycle:

- Maintenance: because your infrastructure is constantly changing, your documentation must be kept current; outdated documentation misinforms others and can trigger disastrous actions.
- Back-up: always keep a backup of your documents. Ideally, your storage location should offer version control, search, filtering, collaboration, and similar features by default, but it is also good practice to keep your own backup; it might be useful one day.
- Share: once the documentation is complete, share it with potential users and ask for feedback; their suggestions can make your documentation more robust.

Conclusion

If you are not 100% convinced of the benefits of documentation, think of it this way: no one wants to waste time figuring out someone else's work, or reinventing the wheel on a project that has already grown and evolved. Documentation that is clear, concise, and easy to understand is the first step toward building a successful cloud infrastructure.