PaaS for Java developers - Part 3

Marketplace services

I want to start part 3 by saying that I really do like and recommend Pivotal Web Services and Cloud Foundry as a simple and robust way to deploy Java applications. I've been running Structurizr on Pivotal Web Services for over 3 years now and I've had very few issues with the core platform. The marketplace services, on the other hand, are a different story.

In addition to providing a deployment platform to run your code, most of the Platform as a Service providers (Pivotal Web Services, Heroku, Azure, etc.) provide a collection of "marketplace services". These are essentially add-on services that give you easy access to databases, messaging providers, monitoring tools, etc. As I write this, the Pivotal Web Services marketplace includes many of the popular technologies you would expect to see, including MySQL, PostgreSQL, Redis, Memcached, MongoDB, RabbitMQ, etc.

MySQL as a service

Let's imagine that you're building a Java web application and you'd like to store data in a MySQL database. You have a few options. One option is to build your own database server somewhere like Amazon AWS. Of course, you need to have the skills to do this and, given that part 1 was all about the benefits of PaaS over building your own infrastructure, the DIY approach is not necessarily appealing for everybody. Another option is to find a "Database as a Service" provider that will create and run a MySQL server for you. ClearDB is one such example, and it's also available on the Pivotal Web Services marketplace. All you need to do is create a subscription to ClearDB through the marketplace (there is a free plan), connect to the database and create your schema. That's it. Most of the operational aspects of the MySQL database are taken care of, including backups and replication.

To connect your Java application to ClearDB, again, you have some options. The first is to place the database endpoint URL, username and password in configuration, like you might normally do. The other option is to use the Cloud Foundry command line interface to issue a "cf bind-service" command to bind your ClearDB database instance to your application instance(s), and use Cloud Foundry's auto-reconfiguration feature. If you're building a Spring-based application and you have a MySQL DataSource configured (some caveats apply), Cloud Foundry will automagically reconfigure the DataSource to point to the MySQL database that you have bound to your application. When you're getting started, this is a fantastic feature as it's one less thing to worry about. It also means that you don't need to update URLs, usernames and passwords if they change.
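As a rough illustration of the first option, here's a minimal sketch that reads the connection details from environment variables. The DatabaseSettings class and the variable names (DB_URL, DB_USERNAME, DB_PASSWORD) are hypothetical, chosen for this example; they are not something Cloud Foundry or ClearDB defines.

```java
import java.util.Map;

// Hypothetical holder for explicitly configured connection details.
final class DatabaseSettings {
    final String url;
    final String username;
    final String password;

    private DatabaseSettings(String url, String username, String password) {
        this.url = url;
        this.username = username;
        this.password = password;
    }

    // Reads the settings from a map of environment variables.
    // The variable names are assumptions made for this sketch.
    static DatabaseSettings fromEnvironment(Map<String, String> env) {
        return new DatabaseSettings(
                require(env, "DB_URL"),
                require(env, "DB_USERNAME"),
                require(env, "DB_PASSWORD"));
    }

    // Fails fast with a clear message if a setting is missing or empty.
    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value;
    }
}
```

At startup you could call DatabaseSettings.fromEnvironment(System.getenv()) and pass the values to whichever DataSource implementation you use. On Cloud Foundry, the variables themselves can be set with "cf set-env" or in the application manifest.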

I used this approach for a couple of years and, if you look at the Structurizr changelog, you can see the build number isn't far off 1000. Each build number represents a separate (automated) deployment to Pivotal Web Services. So I've run a lot of builds. And most of them have worked. Occasionally though, I would see deployments fail because services (like ClearDB) couldn't be bound to my application instances. Often these were transient errors, and restarting the deployment process would fix it. Other times I had to raise a support ticket because there was literally nothing I could do. One of the big problems with PaaS is that you're stuck when it goes wrong, because you don't have access to the underlying infrastructure. Thankfully this didn't happen often enough to cause me any real concern, but it was annoying nonetheless.

More annoying was a little bug that I found with Structurizr and UTF-8 character encoding. When people sign up for an account, a record is stored in MySQL and a "please verify your e-mail address" e-mail is sent. If the person's name included any non-ASCII characters, it would look fine in the initial e-mail but not in subsequent e-mails. The problem was that those characters were not being stored correctly in MySQL. After replicating the problem in my dev environment, I was able to fix it by adding a characterEncoding parameter to the JDBC URL. Pushing this fix to the live environment was problematic though, because Cloud Foundry was automatically reconfiguring my DataSource URLs. The simple solution here is to not use automatic reconfiguration, and it's easy to disable via the Java buildpack or by simply not binding a MySQL database instance to the Java application. At this point, I'm still using ClearDB via the marketplace, but I'm specifying the connection details explicitly in configuration.
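A minimal sketch of that kind of fix, assuming the MySQL Connector/J driver, which accepts useUnicode and characterEncoding as connection properties on the JDBC URL. The JdbcUrls helper and the example host name are made up for illustration, and this isn't necessarily the exact URL used for Structurizr.

```java
// Hypothetical helper for adding connection properties to a JDBC URL.
final class JdbcUrls {

    // Appends one property, using '?' for the first and '&' for the rest.
    static String withParameter(String jdbcUrl, String name, String value) {
        String separator = jdbcUrl.contains("?") ? "&" : "?";
        return jdbcUrl + separator + name + "=" + value;
    }

    // Asks Connector/J to exchange character data with the server as UTF-8.
    static String withUtf8(String jdbcUrl) {
        String withUnicode = withParameter(jdbcUrl, "useUnicode", "true");
        return withParameter(withUnicode, "characterEncoding", "UTF-8");
    }
}
```

For example, JdbcUrls.withUtf8("jdbc:mysql://db.example.com:3306/app") returns "jdbc:mysql://db.example.com:3306/app?useUnicode=true&characterEncoding=UTF-8".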

The final problem I had with ClearDB was earlier this summer. I would often see error messages in my logs saying that I'd exceeded the maximum number of connections. The different ClearDB plans provide differing levels of performance and numbers of connections. I think the ClearDB databases offered via the marketplace are multi-tenanted, and there's a connection limit to ensure quality of service for all customers. And that's okay, but I still couldn't work out why I was exceeding my quota because I know exactly how many app instances I have running and the maximum number of permitted connections in the connection pools per app instance. I ran some load tests with ApacheBench (ab) and I couldn't get the number of open connections to exceed what had been configured in the connection pool. Often I would be watching the ClearDB dashboard, which shows you the number of open connections, and my applications wouldn't be able to connect despite the dashboard only showing a couple of live connections.
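The expectation here is simple arithmetic: in the worst case, the number of open connections is the number of app instances multiplied by the maximum pool size per instance. The sketch below spells that out. The ConnectionBudget class is hypothetical, and the numbers in the usage note are invented for illustration; they are not Structurizr's configuration or a real ClearDB plan limit.

```java
// Hypothetical helper that computes the worst-case number of pooled connections.
final class ConnectionBudget {

    // Upper bound on open connections if every pool is completely full.
    static int maxOpenConnections(int appInstances, int maxPoolSizePerInstance) {
        return Math.multiplyExact(appInstances, maxPoolSizePerInstance);
    }

    // True if that worst case still fits within the plan's connection limit.
    static boolean fitsWithin(int planLimit, int appInstances, int maxPoolSizePerInstance) {
        return maxOpenConnections(appInstances, maxPoolSizePerInstance) <= planLimit;
    }
}
```

With 3 app instances and a maximum pool size of 5, maxOpenConnections(3, 5) is 15, so a plan that allowed 40 connections should never be exhausted by the pools alone.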

Back to vendor lock-in and migration cost. The cost of migrating from ClearDB to another MySQL provider is low, especially since I'm no longer using the Cloud Foundry automatic reconfiguration mechanism. So I exported the data and created a MySQL database on Amazon RDS instead. For not much more money per month, I have a MySQL database running in multiple availability zones, with encrypted data at rest and I know for sure that the JDBC connection is happening over SSL (because that's how I've configured it).

E-mail delivery as a service

Another marketplace service that I used from an early stage is SendGrid, which provides "e-mail delivery as a service". There's a theme emerging here! Again, you can run a "cf bind-service" command to bind the SendGrid service to your application. In this case, though, no automatic reconfiguration takes place, because SendGrid exposes a web API. This raises the question of where you find the API credentials. One of the nice features of the marketplace services is that you can get access to the service dashboards (e.g. the ClearDB dashboard, SendGrid dashboard, etc) via the Pivotal Web Services UI, using single sign-on. The service credentials are usually found somewhere on those service dashboards.

After finding my SendGrid password, I hardcoded it into a configuration file and pushed my application. To my surprise, trying to connect to SendGrid resulted in an authentication error because my password was incorrect. So I again visited the dashboard and yes, the password was now different. It turns out that, and I don't know if this is still the case, the process of running a "cf bind-service" command would result in the SendGrid credentials being changed. What I didn't realise is that service credentials are set in the VCAP_SERVICES environment variable of the running JVMs, and you're supposed to extract credentials from there. This is just a regular environment variable, with JSON content. All you need to do is grab it and parse out the credentials that you need, using one of the many code samples or libraries on GitHub. From a development perspective, I now have a tiny dependency on this VCAP stuff, and I need to make sure that my local Apache Tomcat instance is configured in the same way, with a VCAP_SERVICES environment variable on startup.
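To give a feel for what "grab it and parse out the credentials" involves, here's a deliberately small sketch that uses only the JDK. It assumes the usual VCAP_SERVICES shape, a JSON object keyed by service label where each service instance carries a "credentials" object, and it assumes those credentials are flat string values. The VcapCredentials class is hypothetical, and the "sendgrid" label and the "username"/"password" key names are assumptions too. The regular expressions don't handle nested objects, non-string values or escaped quotes, so a proper JSON library is the better choice for anything real.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, JDK-only helper; not a general-purpose JSON parser.
final class VcapCredentials {

    // Matches a flat "credentials": { ... } object (no nested braces).
    private static final Pattern CREDENTIALS =
            Pattern.compile("\"credentials\"\\s*:\\s*\\{([^{}]*)\\}");

    // Matches simple "key": "value" pairs (no escaped quotes).
    private static final Pattern STRING_PAIR =
            Pattern.compile("\"([^\"]+)\"\\s*:\\s*\"([^\"]*)\"");

    // Returns the string credentials of the first instance found after the
    // given service label, or an empty map if nothing matches.
    static Map<String, String> forService(String vcapServicesJson, String serviceLabel) {
        Map<String, String> credentials = new LinkedHashMap<>();
        if (vcapServicesJson == null) {
            return credentials;
        }
        int start = vcapServicesJson.indexOf("\"" + serviceLabel + "\"");
        if (start < 0) {
            return credentials;
        }
        Matcher block = CREDENTIALS.matcher(vcapServicesJson);
        if (block.find(start)) {
            Matcher pair = STRING_PAIR.matcher(block.group(1));
            while (pair.find()) {
                credentials.put(pair.group(1), pair.group(2));
            }
        }
        return credentials;
    }
}
```

In the application this would be called as VcapCredentials.forService(System.getenv("VCAP_SERVICES"), "sendgrid"), with the same environment variable defined for the local Tomcat instance.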

Some time later, SendGrid moved to v3 of their API, which included a new version of the Java library. So I upgraded, which resulted in the API calls failing. After signing in to the SendGrid dashboard, I noticed that I now have the option of connecting via an API key. Long story short, I ditched the VCAP stuff and configured the SendGrid client to use the API with the API key, which I've also added to my deployment configuration.
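For illustration only, the sketch below shows the general shape of an API-key call against SendGrid's v3 mail send endpoint. It uses the JDK's own HTTP client (Java 11 or later) rather than SendGrid's Java library, so it is not the code Structurizr uses. The SendGridRequests class is hypothetical, and where the API key comes from (for example an environment variable) is an assumption left to the caller.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper that builds, but does not send, a v3 mail send request.
final class SendGridRequests {

    static HttpRequest plainTextMail(String apiKey, String from, String to,
                                     String subject, String body) {
        // JSON payload in the structure the v3 mail send endpoint expects.
        String json = "{"
                + "\"personalizations\":[{\"to\":[{\"email\":\"" + escape(to) + "\"}]}],"
                + "\"from\":{\"email\":\"" + escape(from) + "\"},"
                + "\"subject\":\"" + escape(subject) + "\","
                + "\"content\":[{\"type\":\"text/plain\",\"value\":\"" + escape(body) + "\"}]"
                + "}";
        return HttpRequest.newBuilder(URI.create("https://api.sendgrid.com/v3/mail/send"))
                .header("Authorization", "Bearer " + apiKey) // the API key replaces username/password
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    // Minimal JSON string escaping: backslash, quote and common control characters.
    private static String escape(String s) {
        return s.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n")
                .replace("\r", "\\r")
                .replace("\t", "\\t");
    }
}
```

The request can then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), checking for a 2xx status code in the response.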

Other services

I used the Pivotal SSL Service for a while too, which provides a way to upload your own SSL certificate. When used in conjunction with the Cloud Foundry router, you can serve traffic from your own domain name with a valid SSL certificate. I also had a few issues with this, resulting in downtime. The Java applications were still running and available via the cfapps.io domain, but not via the structurizr.com domain. I've since switched to using CloudFlare's dedicated SSL certificate service for $5 per month. I did try the free SSL certificate, but some people reported SSL handshake issues on some corporate networks when uploading software architecture models via Structurizr's web API.

I also used the free Redis marketplace service for a while, in conjunction with Spring Session, as a way to store HTTP session information. I quickly used up the quota on that though, and found it more cost effective to switch to a Redis Cloud plan directly with Redis Labs.

PaaS without the marketplace

There are certainly some benefits to using the marketplace services associated with your PaaS of choice. It's quick and easy to get started because you just choose a service, subscribe to it and you're ready to go. All of your services are billed from, and managed in, one place, so that's nice too. And, with Cloud Foundry, I can live with configuration via the VCAP_SERVICES environment variable; at least everything is in one place.

If you're just starting out with PaaS, I'd certainly take a look at the marketplace services on offer. Your mileage may vary, but I find it hard to recommend them for production use. As I said at the start of this post, the core PaaS functionality on Pivotal Web Services has been solid for the three years I've been using it. Any instability I've experienced has been around the edges, related to the marketplace services. It's also unclear what you're actually getting in some cases, and where the services are running. If you look at the ClearDB plans, the free plan ("Spark DB") says that it's "Perfect for proof-of-concept and initial development", whereas the $100 per month "Shock DB" plan says "Designed for apps where high performance is crucial". These plans are not listed on the ClearDB website, so it's hard to tell whether they are multi-tenant or single-tenant services. Some of the passwords created by marketplace services also look remarkably short (e.g. 8 characters) considering they are Internet-accessible.

With all of this in mind, I prefer to sign up with a service directly and integrate it in the usual way. I don't feel that the pros of using the marketplace services outweigh the cons. I'm also further reducing my migration cost, should I ever need to move away from my PaaS. In summary then, the live deployment diagram for Structurizr now looks like this:

[Diagram: Structurizr - Deployment]

The Java applications are hosted at Pivotal Web Services, and everything else is running outside, yet still within Amazon's us-east-1 AWS region. This should hopefully help to address another common misconception that you need to run everything inside of a PaaS environment. You don't. There's nothing preventing you from running Java applications on a PaaS, and having them connect to a database server that you've built yourself. And it gives you the freedom to use any technology you choose, whether it's available on the marketplace or not. You do need to think about co-location, performance and security, of course.

So that's a summary of my experience with the marketplace services. In part 4 I'll discuss more about my build/deployment script, and how straightforward it is to do zero-downtime, blue-green deployments via Cloud Foundry. Comments or questions? Tweet me at @simonbrown.

About the author

Simon is an independent consultant specializing in software architecture, and the author of Software Architecture for Developers (a developer-friendly guide to software architecture, technical leadership and the balance with agility). He’s also the creator of the C4 software architecture model and the founder of Structurizr, which is a collection of open source and commercial tooling to help software teams visualise, document and explore their software architecture.

You can find Simon on Twitter at @simonbrown ... see simonbrown.je for information about his speaking schedule, videos from past conferences and software architecture training.



