I wanted to put together some thoughts on high-level best practices for your dev shop. These practices skew toward larger dev shops inside big corporations, but they can also be applied to one- or two-person teams. They are inspired by a combination of The Pragmatic Programmer and The Joel Test. I read both of these sources early in my career, and they have continuously provided inspiration on the kinds of practices a highly effective dev team should follow. Furthermore, I've found it surprising how many shops I've worked with break these practices. Following them will increase your dev team's effectiveness, making for happy developers and managers. Breaking them will result in slower development velocity and frustrated developers and managers.
- Streamline your local build process.
Do you think your build process is pretty fast? It could be faster. If it takes more than a few seconds to redeploy your code you lose your train of thought – and there is a cost to putting that train back on the rails. As someone who has spent a lot of time doing Enterprise Java development, it often amazes me what developers will put up with in their build processes. Here are a couple of tips to decrease build time:
- Your environment probably supports some sort of hot swap method where you don’t need to do a full build after every change. Java’s JVM Hot Swap feature allows this for some cases. I’ve also heard very good things about JRebel.
- If you have a medium to large sized application, you probably don't need to rebuild the whole thing every time. Make sure you're not rebuilding more components than necessary. Here are some things I've seen automatically included in a full build that could be excluded from a "fast" version of your build: code generated from wsdl/xsd, database migrations, unit test suite execution, and minification/obfuscation of js/css. In Java, the war/ear step packages all of your files into a war/ear. When you deploy that war/ear to your application server, it unpackages all of the files again. You can eliminate a step by copying your exploded war/ear directory directly into your application server.
- Use a debugger.
Do you test your application by inserting new log statements and redeploying? Stop it! One of the best ways to cut down on redeploy time is to not have to redeploy at all. Every commercially used language supports debugging. Here are a few links to get you started using a debugger with your language.
- Automate your QA builds and minimize QA downtime
Your QA build process should be quick and painless. I would shoot for a combination of automated QA builds (2 or more times per day) and one-click, on-demand deployments. The outage to a QA environment needs to be small – 5 minutes or less – to keep the deployments painless enough to not disrupt the workflow of the QA staff.
Doing frequent, automatic QA builds does a couple of great things for your IT organization: it sets a precedent that your team has the ability to quickly turn around fixes, and it minimizes the impact of bugs in your QA process. After all, if your team finds a bug, it's not a huge deal, since the bug might be resolved in a few hours.
Contrast this approach with an organization I recently worked with that only did two QA builds per week because their build process took so painfully long to complete. Every bug became a news headline. The management team would often choose not to fix bugs because of the huge turnaround time and the risk of additional lost time in case a bug fix was not successful.
- Treat database schema updates like source control
SQL is code, and your database structure is the product of that code. In the same way that every developer working on a project needs to be able to run the software on their local system, those developers also need to be able to run their own copies of the database. Database changes should be part of your normal build process, and like your code, your database schema needs a version so you know which changes have been applied. This helps keep your development, test, and production environments all in a sane state.
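As a minimal sketch of what "a version for your schema" can look like in plain SQL (the table and column names here are purely illustrative, and tools like Liquibase or Flyway will manage this bookkeeping for you):

```sql
-- one row per migration script that has been applied to this schema
create table SCHEMA_VERSION (
  VERSION     number        not null primary key,
  DESCRIPTION varchar2(200) not null,
  APPLIED_ON  date          default sysdate not null
);

-- each migration script records itself as its final step
insert into SCHEMA_VERSION (VERSION, DESCRIPTION)
values (7, 'add shipping address columns to customers');
```

Any environment can then be checked against source control by selecting max(VERSION) from this table.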
It's easy to fall into the trap where a database is so difficult to create that you end up running one development database for all of your developers to share. The problem comes when you need to support multiple work streams at the same time, which means you need to run multiple versions of the schema simultaneously. You can work around this problem by adding more schemas and more hardware, but the overhead involved may make it really difficult for developers to do the kind of risk taking and rapid iteration required to reach a high velocity.
That’s all for now. Please send feedback in the comments if this post has helped provide value for your team, or even if it seems too rudimentary. I’d be happy to do a deeper dive on any one of these topics if there is interest.
Beer Dogging is a member organization committed to the promotion of Craft Beer. Recently the owner of Beer Dogging, Don DiBrita, asked me to build him an app to offer his members deals at local bars and other Craft Beer related venues. I wanted to share the process that went into building it, since the results turned out pretty decent and came together within a short timeframe.
- Give users quick access to Beer Dogging deals, sorted by proximity
- Keep costs low
As you can tell from the title of this post, the technology stack includes Phonegap and AngularJS, but we also have a server component built with Rails. I believe that this stack gives you the maximum amount of exposure for the lowest cost, at least for small to medium sized apps.
Mobile Client: Phonegap, AngularJS, Topcoat, Google Analytics
The app is built in Phonegap so that it can work on both the Android and iOS platforms without much trouble. The latest version at the time was Phonegap/Cordova 3.3.0. Cordova as a platform has gained a lot of maturity over the last couple of years but still has a ways to go. I'm pleasantly surprised at how well the new command line tools help you manage your project across multiple platforms, but I still find some gaps in the workflow, and cases where plugins just stop working unexpectedly and need to be re-installed. I stayed away from Phonegap Build on this project, opting instead to use the native SDKs on my OS X environment – it's just faster IMO. I also wanted to share my Phonegap development workflow; hopefully it will help someone out there, since it took me a while to figure out myself.
- Create the project with command line tools
- Add the Android and iOS 'platforms' with command line tools
- Start a Ripple emulator for your Android platform.
- Set up a grunt task to automatically sync your web resources to the Android platform
- Modify web resources in the www/ folder. After they sync to your Android platform, refresh your Ripple emulator to test the changes.
- Repeat until you’re ready to test on a physical device or a platform emulator
- Build & test on a native SDK (Xcode or Android SDK)
I'm using the Google Analytics Phonegap plugin (GAPlugin) for metrics gathering, and I was happy with the amount and quality of metrics it gives you for not a lot of work. This plugin allows GA to recognize your Phonegap application as a real 'mobile app' instead of a web app.
To polish off the UI and give the app that 'mobile' feel, I used Topcoat.
Server Side: Rails, Heroku, Postgres, Geocoder, RABL
On the server side we're using a no-frills Ruby on Rails web application, whose sole purpose is to serve JSON data about places and deals to the mobile application. It also has an admin web interface where an admin creates/updates/deletes places and deals. I used the RABL gem to format the JSON responses.
Geocoder is a Ruby gem that helps you do geolocation queries (e.g., find all places within a 10 mile radius of me). It is a must if your Rails app has any kind of location data. I also use it in http://findmybeer.com.
I used Heroku for application hosting, both for ease of use and for cost. We are using the single free instance today and have the ability to scale up to meet demand. On the database side I'm using the free instance of PostgreSQL that Heroku provides. It works for me and supposedly has pretty good geo query performance.
I want to talk for a minute about some of the design choices we made during the creation of this app. Unlike a lot of sites with a membership component, you are not required to login or verify your membership status on the app in order to view or try to redeem deals. We purposely decided to take a low-tech approach to this problem for now; Beer Dogging members already have physical wallet sized member cards, so in order to redeem an ‘Alpha Dog’ deal you have to show your member card in addition to showing your server/bartender the deal within the app. This choice aligns with our Lean product development philosophy, and allowed the client to keep the cost down during initial development. It is something we can add in the future when this app starts to take off.
You work in software, and your stack includes an Oracle database. One day the business approaches you and says, "I want a search page for our product/order/customer data. Make it work like Google." You think to yourself, "If I could make a search page work like Google, I would work at Google!" Fear not, developer. This problem has been solved many times in the past. In this blog post I'm going to show you how to approach it, and show you a shortcut in case your environment's stack includes an Oracle database.
Approaching Full Text Search
The problem you're solving has a name, and that name is Full Text Search. The problem is that your relational database, while presumably well normalized, is not good at searching for single words across huge data sets. You need a different kind of database, one optimized for full text search. A search database physically stores the data differently so that it can quickly look up your search terms and return some metadata associated with those terms. In your RDBMS, records are identified by keys. In your search index, the keys are the search terms.
There are several well known full text search solutions. The bare minimum list you should probably know about is Solr/Lucene, Sphinx, and ElasticSearch. These are all great full text search solutions, but they all require a lot of overhead to operate. New servers, new software to install, new syntaxes to learn, admin consoles, and new interfaces or libraries to build into your front end application.
Oracle Stack Solution: Oracle Text
One drawback of each of the aforementioned search solutions is that you will likely want to run it on a dedicated machine (or VM). If you work in an Oracle shop, it likely means that you work in an enterprise where provisioning hardware (even virtual hardware) can be an annoyingly difficult and time-consuming process. In this environment, Oracle Text jumps out as a really nice solution. Oracle Text is a full text search solution that is built in to all modern versions of Oracle's database. This means that you don't have to request a new machine, and then request new software to be installed on that machine in each of your QA and Production environments (or request root access to do it yourself). With Oracle Text you just run some DDL to create the index and start using it!* The only hardware issue you should consider is the amount of disk in use on your Oracle database.
Here’s a simple example of how to take advantage of an Oracle Text search index. Let’s assume that I have a database with products and reviews (a product has many reviews) and I want to be able to return search results for both at once.
The most straightforward way to start is to gather all of the data you want to index into a single VARCHAR2 column named SEARCH_TEXT on our PRODUCTS table. If you need to index more than 4000 characters, use a CLOB.
alter table PRODUCTS add SEARCH_TEXT varchar2(4000);
Now we need to populate that column with the search data we want to index from the PRODUCTS and REVIEWS tables. We are going to fetch the data into the search text column as one big space-delimited string. The query below uses a correlated subquery in an update, which is specific to Oracle. You can accomplish the same thing with a procedure, but I find this more concise.
update PRODUCTS P
set P.SEARCH_TEXT = P.name || ' ' || P.description || ' ' || (
  -- aggregate every review's title and text for this product into one string
  select listagg(R.title || ' ' || R.review_text, ' ') within group (order by R.id)
  from REVIEWS R
  where R.PRODUCT_ID = P.ID
);
Next we create the Oracle Text index on that column. The important part is the ctxsys.context at the end of this statement. CONTEXT is one of the three types of text indexes that Oracle offers, and it's the best one for blocks of unstructured text.
create index PRODUCT_REVIEW_SEARCH_IDX on PRODUCTS(SEARCH_TEXT)
indextype is ctxsys.context;
It is worth noting that you can configure the index to use a separate tablespace so that you can control where on the disk your index lives. See the docs for more info.
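As a sketch (assuming a tablespace named SEARCH_TS, which is purely illustrative; check the storage preference docs for your Oracle version), you would create a storage preference first and reference it instead of the bare create index above:

```sql
-- define a storage preference that puts the main token table in SEARCH_TS
begin
  ctx_ddl.create_preference('SEARCH_STORE', 'BASIC_STORAGE');
  ctx_ddl.set_attribute('SEARCH_STORE', 'I_TABLE_CLAUSE',
                        'tablespace SEARCH_TS');
end;
/
create index PRODUCT_REVIEW_SEARCH_IDX on PRODUCTS(SEARCH_TEXT)
  indextype is ctxsys.context parameters ('storage SEARCH_STORE');
```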
Next we run a command to 'sync' the index. This actually indexes the data for the first time. Run it again after you've inserted or updated data to bring the index up to date. In fact, you should plan on running this command periodically as a dbms_scheduler job or via whatever your enterprise's favorite scheduler is.
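The sync command itself isn't shown above; assuming the index name from the earlier create statement, it's a one-line PL/SQL call:

```sql
begin
  -- index rows inserted or updated since the last sync
  ctx_ddl.sync_index('PRODUCT_REVIEW_SEARCH_IDX');
end;
/
```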
Now we can run a full text search query and see some results. A statement like this will return all product records which have the word 'paper' in the title, description, or reviews. Yay! It's pretty awesome that we can run searches on this index in our existing RDBMS and apply whatever filters, sorts, and joins we want without having to call out to another system.
select * from PRODUCTS where contains(SEARCH_TEXT, 'paper') > 0;
Finally, we create a job to periodically 'optimize' the index. According to the docs, your index gets fragmented and slower over time, and this will fix it up. I've had luck running this nightly, but YMMV.
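The optimize call looks like this, again assuming the index name from above ('FULL' mode defragments the index and also reclaims space left behind by deleted rows):

```sql
begin
  ctx_ddl.optimize_index('PRODUCT_REVIEW_SEARCH_IDX', 'FULL');
end;
/
```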
After you've got your index up and running, you can get some useful info and stats out of it with the CTX_REPORT package. Among other things, it will tell you how fragmented your index is and which words are most frequently indexed.
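As a sketch of how to pull a report out of it (INDEX_STATS fills a CLOB you pass in, and it can take a while on a large index; check the CTX_REPORT docs for your version):

```sql
set serveroutput on
declare
  report clob := null;
begin
  ctx_report.index_stats('PRODUCT_REVIEW_SEARCH_IDX', report);
  -- print the first 4000 characters of the report
  dbms_output.put_line(dbms_lob.substr(report, 4000));
  dbms_lob.freetemporary(report);
end;
/
```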
I've really just scratched the surface here to get a text index up and running fast. Oracle has a ton of options to tune the index, plus search features like fuzzy searching, stemming, and wildcards.
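A few examples of what those query operators look like against the index above (a sketch; see the CONTAINS operator docs for the full list):

```sql
-- stemming: $word matches grammatical variants (paper, papers, ...)
select * from PRODUCTS where contains(SEARCH_TEXT, '$paper') > 0;

-- fuzzy: tolerates near-misses and misspellings like 'papper'
select * from PRODUCTS where contains(SEARCH_TEXT, 'fuzzy(papper)') > 0;

-- wildcard: prefix matching
select * from PRODUCTS where contains(SEARCH_TEXT, 'pap%') > 0;
```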
*Ok, maybe you should still consult a DBA first if you have access to one.
You have an established business built around a mature product. Customers are buying, revenue is being generated, and presumably there are multiple people making a comfortable living to ensure that things stay that way. Inevitably what follows is a culture of devotion to the status quo, or perhaps to marginal improvements (like process improvement and cost reduction). The organization becomes extremely risk averse. While the market around your business never stops changing, your slow-moving company can either adapt or die.
One day an innovator in the organization identifies a new business opportunity and approaches you with it.
Hey, there’s a new market we could capture. It’s big. We just need to take a little risk to get to it.
Senior Management may say “We’re not in the business of taking new risks. We’re in the business of selling what we have.” And we can’t blame senior management for this response. It’s political – and the stakes are huge. If a senior executive gets behind a new risky initiative and it fails, they are fired, or at least their career takes a big hit. Even if there were a large reward potential the risk may not be worth it to the executive.
What if we optimized this problem and reduced the risk by limiting the potential downsides? Let's start by identifying the major risks:
- Brand Risk. What if you release your new product and it is a total failure? Your company’s name will be forever tarnished, and sales of existing product will get hurt as a result.
- Financial Risk (both capital and opportunity costs). To avoid Brand Risk we should put extra effort into making sure that the product is fully polished and ready for prime time. We will need teams of people working for multiple months (or even years) on the product development.
- Personal/Political Risk. If I’ve championed this project and it turns out to be a flop, I will suffer public embarrassment and will likely get fired/asked to resign. Hmm, maybe instead I should play it safe and let someone else take that risk if they want to.
Tarnishing your brand, losing money, and getting fired. Yep, those are giant risks. What if I told you that you could reduce each of these risks to a manageable size? Within a small pocket of your company you can create a culture of innovation. Within this pocket you must encourage experimentation and celebrate failure in the interest of learning. Treat this pocket of innovation like a small startup inside your larger organization.
We can solve this problem together. To get started building our new innovation culture we must first accept the fact that we will NOT get the product right on the first try. In fact, you have no idea how many iterations you need to make to get it right.
Failing often is OK as long as we limit the scope of our failures. We are not talking about orders failing to emerge after a lengthy product development cycle and advertising campaign. We’re talking about showing a customer a drawing of the product and hearing that it doesn’t quite do what they need. By limiting the scope of our failures we can decrease the duration of these business model experiments and increase their frequency. Under these conditions innovation is free to thrive.
How liberating! The first try is not what matters – what matters is that you iterate over your business model until you get it right, ultimately achieving Product-Market Fit (hopefully before your competitors and definitely before you run out of money). The other key is to limit spending until you’ve achieved product-market fit – for example, don’t spend money on advertising until you’re confident that the product is viable.
Once the organization accepts this idea, it can start making rational choices to minimize the risk associated with the new business opportunity. Let's look at each of our major risks and see how to minimize them given these new assumptions about our uncertainty in achieving product-market fit.
- Brand Risk is non-existent. If you build a product that nobody wants (and you haven’t advertised it yet), then nobody even cares! Just resist the urge to create ‘buzz’ around your product launch until you’re pretty sure you’ve gotten it right. You can’t fail to meet expectations if you don’t create any. To further protect the original brand, you could release your new product under a different brand, and only attach the parent brand when you’ve proven the success of the new product.
- Financial Risk – limit spending until achieving Product-Market Fit. The solution to mitigating financial risk is the Minimum Viable Product. The basic idea is to build only what is absolutely necessary in order to validate your assumptions about your business model. If we disprove an assumption, we come up with new assumptions and modify the MVP accordingly in order to validate those; then repeat. Eric Ries calls this the Build-Measure-Learn cycle, and it is the basis for the Lean Startup movement (fig A). We spend as little money as possible on the MVP, and we don't spend money on advertising activities until we've proven product-market fit. Sometimes this means building a single landing page to see how many people click on the 'sign up' button. Sometimes it means building a non-functioning prototype and taking it to a few customers to get their feedback. Still, even if we've only got a couple of people working to iterate on the MVP, money is being spent, and it can be tough to quantify progress to senior management. The bottom line is that it doesn't take many resources to prove whether or not a business idea is viable (and if the idea is not viable, you should kill the project).
- Personal/Political Risk can be minimized. We've minimized brand risk and financial risk, so the political risk for the project champion is also minimized. Under these conditions, senior management should be able to reassure the project champion that her career is safe even if this business experiment does not achieve Product-Market Fit. This gives the project champion the confidence she needs to run without constantly having to watch her back.
The result is a framework for creating innovations in your company, ensuring that it will always be able to move with the market and capitalize on new opportunities.
I've seen a lot of different attitudes towards automated testing over the years, ranging from zealous acceptance to complete rejection. I would like to cut through the rhetoric, preconceptions, and misconceptions, and start the conversation from the only place that makes sense.
First, a couple of definitions:
Unit Tests – Low level tests for application logic. Inputs will be mocked or stubbed. These do not make calls to services or external databases. They must run quickly, easily, and frequently. Example: testing a method that calculates taxes based on a matrix of product types and zip codes.
Integration Tests – Application logic tests that may make calls to external services or databases. These are slower and may run less frequently. Example: testing payment submission such that a payment gateway is called, a successful response is received, and that the response is logged to the database.
Functional Tests – Tests that run from a user perspective. Usually this means invoking a user interface and testing that it reacts appropriately. Example: testing that the user can log in, place an order, and receive a confirmation email. These tests are not the focus of this article.
The Business Case: How Does Building Unit Tests Help Me Make Money?
Make no mistake: there are costs associated with building tests, and they may require a large upfront investment before any benefit is realized. In fact, to the non-technical manager, this combination of fuzzy benefits and clear costs may make testing seem like a poor investment. That line of thinking is at best short-sighted, and at worst blatantly negligent of your product's future costs.
Testing helps your product development cycle move faster
This may be counter-intuitive. As it turns out, the whole team has much higher confidence that new features have not broken existing features when those existing features are continuously and automatically tested. The result is a QA cycle with fewer defects, which finishes faster and gets the product to market quicker. The business gains more confidence in product quality during each product development cycle. The tech team also gains confidence – unafraid to jump in and modify critical parts of the system to meet the latest business need. The tests have got your back.
Tests provide historical proof about how the application is supposed to function
One client I had actually opposed maintaining a unit test suite for their enterprise system. When it came time to make some minor enhancements to the billing system, it turned into a big project. The billing logic had become unmaintainable spaghetti after years of monkey patching by developers who had come and gone. In some cases the business people couldn't even tell us how it was supposed to work. We were literally afraid to modify it. We limped through the project, but started planning a rewrite of the billing system as a future project, involving not just the tech team but also business resources to properly document the functionality.
We test because we care about our customers and because we care about our software.
If you’ve read this far, check out my latest project: FindMyBeer , where you can find who sells or serves craft beer near you.
Grails is a great framework that enables rapid development with Java. As with any framework, however, you sometimes get stuck and need to take a look under the covers to solve a problem.
Today I was fighting with mapping a many-to-many relationship (a common occurrence in grails), and needed to figure out exactly why Grails and Hibernate were not doing what I expected. One of the great things about Java is that nearly the whole stack is open source so you can just step through the code to see what is going on, as long as you can find the code (and navigate through injected dependencies, but that is a different story).
Eclipse Maven plugins provide great tools to 'Download Sources' and start viewing them immediately when you step into some third party library code. I am developing a Grails application on SpringSource Tool Suite (STS, v2.8.2 as of this writing), which is becoming the industry standard IDE for Spring and Grails based applications. Since Grails uses a Maven-like dependency management system, you would expect STS to be able to download sources for any of the Grails dependencies easily, right?
That feature may work in Grails 2.0, but if you're using Grails 1.3.7 you will need a plugin named eclipse-scripts, which lets you download sources and then configure your projects so that STS can find them. Here's what you do:
grails install-plugin eclipse-scripts
grails compile
grails download-sources-and-javadocs
grails sts-link-sources-and-javadocs
Then restart STS and refresh your project. Now you can navigate into your project’s Grails Dependencies and view their source through STS!
Credit to Lari Hotari for creating the eclipse-scripts plugin.
I gave a talk at the Geneca office back in July and did not realize that it was available on the internet until today.
I couldn't hold a candle to Brian Greene on such topics as quantum entanglement, the Higgs boson, or Grand Unified Theory (despite obtaining a B.A. in Physics); however, I can apply the scientific method to improving the performance of your software.
In this article I will explain a basic, but often overlooked foundation for improving the performance of any software application.
Much of software development is an art, but performance tuning is a science. I've seen a lot of good developers waste significant amounts of time on performance with little to show for it, or, just as bad, improve performance without knowing exactly which change had the desired effect.
Do you remember talking about the Scientific Method from your high school science class? The diagram on the right is a refresher. The scientific method is the repeatable process on which all scientific exploration is based. It gives scientists across the world a common language and framework to compare the process and outcomes of experiments.
The scientific process provides a few important points that can be applied to software performance optimization:
- Repeatable process – use the same process for every performance enhancement you make
- Only modify one variable at a time – Do not make multiple tweaks at the same time.
- Record the results of each optimization. Track what you did and how much it helped.
This sounds simple right? It is. The tough part for software developers is to never break these rules during a round of optimizations. To the right I’ve also included a more detailed diagram of what the scientific process looks like when applied to performance optimization. Let’s call it the Performance Optimization Method.
But I know what I’m doing! Why shouldn’t I make multiple tweaks at once?
Let's say you do make two changes at once. You optimize two queries and drop the page load time from 3s to .1s. Do you know how much relative impact the changes had? Did each change reduce the cost by the same amount (50%/50%)? Did one query account for most of the cost (75%/25%)? Or did one of the changes have no impact at all (100%/0%)? What if the two changes were somehow interdependent? For the most part these questions are impossible to answer unless you use a repeatable process and only modify one variable at a time. There are exceptions (there are always exceptions): if you have a good profiling tool that tells you exactly what two different method calls cost, and you are absolutely sure they are not somehow related, then you could cut a corner and make multiple changes at once. If the results do not turn out as expected, you still need to go back and make the changes one at a time. By the way, I hope you are testing against the volume of data you expect in production.
Don’t forget to record the result of each optimization. This way you can throw your results into a table, and with a little explanation about the process and results you turn it into a report and send it to management so they can see how you’re spending their budget (and how good you are at science). Having these sorts of metrics reports also makes it easy for stakeholders to justify the time spent on performance optimization activities.
The law of diminishing returns applies to performance enhancements. At some point you will have picked all of the low-hanging fruit and enhancements start to get progressively more expensive. Stakeholders need insight into how this is progressing on your project so they can make decisions on how much more to spend on performance. Metrics reports should provide sufficient detail for stakeholders to make those decisions.
Ultimately you will end up with a faster application and a clear story of how you got there. Isn’t science fun?
I am working on a project to convert a handful of J2EE applications from an Oracle OC4J application server (no longer supported) to JBoss 5.1.0. Among the many challenges in the conversion is the fact that JBoss's default profile has a significantly larger memory footprint than OC4J. In the past I have just accepted that JBoss uses over 400MB of heap space before you even deploy anything. This time, however, we were hoping to reuse the same hardware from the old application server with the new application server. When the test system started paging and eventually used up all of the physical memory available, we were forced to choose between ordering more memory and trying to tune JBoss to reduce the memory footprint.
We ended up having a lot of success reducing the footprint through tuning. Bottom line: we reduced the memory footprint by 120MB, and the startup time from 53s to 24s.
Here are the steps taken:
| Step | Heap Size (MB) | Used (MB) | Reduction in Used (MB) |
| --- | --- | --- | --- |
| Commented out debug level MBeans annotation in deployers.xml | 322 | 247 | 67 |
| Removed ejb3 services | 317 | 238 | 9 |
| Removed messaging folder & props | 310 | 238 | 0 |
| Removed seam & admin-console | 256 | 205 | 33 |
| Removed xnio-deployer and xnio-provider | 256 | 203 | 2 |
The instructions for each step can be found at http://community.jboss.org/wiki/JBoss5xTuningSlimming
Notes on my environment and testing process:
- Windows XP, JDK 1.6.0_22
- JBoss 5.1.0.GA, Xmx=512M, Xms=256M (this is why the heap didn't drop below 256)
- I used jvisualvm to watch the heap and “used” memory values
- For the “Used” memory, I took the maximum observed value while JBoss was starting. If you understand that a time vs. memory usage graph follows a sawtooth pattern as objects are instantiated and garbage collected, then I took the value from the tip of the highest tooth.
I’ve recently bought into the idea that the best design is achieved when you can no longer find anything to eliminate while still achieving your goals. This idea also aligns well with the concepts of Lean process improvement, and the constant elimination of waste.
With this in mind, I simplified my blog UI today, switching to this theme from iNove (http://wordpress.org/extend/themes/inove). It's not a big change, but it's an improvement.