We’ve made it a general rule to move away from relying on fixtures in our projects. The main reasons are:
- Fixtures are fragile. They often break when the schema changes or, even worse, appear to work while introducing subtle bugs.
- Making fixtures portable sometimes takes extra work (for example, defining natural keys).
- Processing large fixtures can be very slow, which slows down installation and testing cycles.
- Other smart people in the community are recommending the same approach.
Instead, we look for good tools, which usually fall into one of two categories: data generation or fixture factories. We’ve had some success with django-whatever and wanted to share a few tips. Some of its benefits:
- Generating one or many instances of a Model can be done in a line or two of code.
- It is easy to handle things like non-standard primary keys or recursive relationships.
- Because field values are generated randomly, you get a degree of fuzz testing for free.
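To make the idea concrete, here is a minimal sketch of what a data generator does: fill every field you did not specify with a random value, and let you override the ones you care about. It is plain Python and deliberately does not use django-whatever’s API. The `Author` dataclass, `random_value`, and `any_instance` are hypothetical names invented for this illustration.

```python
import random
import string
from dataclasses import dataclass, fields


@dataclass
class Author:
    # Plain-Python stand-in for a Django model (hypothetical, not a real Model).
    name: str
    age: int
    active: bool


def random_value(field_type, rng):
    """Return a random value for the few types this sketch supports."""
    if field_type is str:
        return "".join(rng.choice(string.ascii_lowercase) for _ in range(8))
    if field_type is int:
        return rng.randint(0, 100)
    if field_type is bool:
        return rng.choice([True, False])
    raise TypeError(f"no generator for {field_type!r}")


def any_instance(cls, rng=random, **overrides):
    """Build cls, filling every field not named in overrides with random data."""
    values = {
        f.name: random_value(f.type, rng)
        for f in fields(cls)
        if f.name not in overrides
    }
    values.update(overrides)
    return cls(**values)


# One instance in one line, or many in a comprehension.
one = any_instance(Author)
many = [any_instance(Author, active=True) for _ in range(5)]
```

Every run produces different names and ages, which is where the incidental fuzz testing comes from.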
Like all powerful tools, it makes it easy to inflict pain accidentally. A few lessons we’ve learned:
- If tests fail at random, a likely cause is a model field receiving random values while the test expects a constant one. Set any field the test asserts on explicitly when creating the instance.
- Use any_user when creating User instances.
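The randomly failing test problem is easier to see in code. The sketch below is plain Python rather than django-whatever itself, and `Account`, `any_account`, and `can_withdraw` are hypothetical names made up for the example. It shows the flaky pattern, the fix of pinning the field the assertion depends on, and a fixed seed as one way to replay a failing run.

```python
import random
from dataclasses import dataclass


@dataclass
class Account:
    # Plain-Python stand-in for a model; the names here are hypothetical.
    owner: str
    balance: int


def any_account(rng, **overrides):
    """Return an Account with random fields; keyword arguments pin specific ones."""
    values = {
        "owner": f"user{rng.randint(1, 999)}",
        "balance": rng.randint(-100, 100),
    }
    values.update(overrides)
    return Account(**values)


def can_withdraw(account, amount):
    return account.balance >= amount


# Flaky: balance is random, so this would pass or fail by chance.
#   assert can_withdraw(any_account(random.Random()), 10)

# Stable: pin the field the assertion depends on, leave the rest random.
rng = random.Random()
assert can_withdraw(any_account(rng, balance=50), 10)
assert not can_withdraw(any_account(rng, balance=5), 10)

# Reproducible: the same seed replays the same "random" data.
first = any_account(random.Random(1234))
second = any_account(random.Random(1234))
assert first == second
```

Pinning only the fields a test actually asserts on keeps the rest of the instance random, so you keep the fuzzing benefit without the flakiness.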
Finally, we’ve been discussing the merits of using something more declarative, like factory-boy, or perhaps finding another way to generate a full range of fuzz values without being so coupled to the unit tests. We’ve noticed similar patterns emerging between parts of the unit test code and the data generation code. Once you have a solid base, it is often helpful to be able to use code to quickly generate real-world scenarios with complex relationships for testing and feedback.
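As a rough picture of what we mean by generating scenarios, here is a small sketch of a builder that composes simple factories into a connected object graph. It is plain Python, not factory-boy or django-whatever, and `Author`, `Post`, and `build_blog_scenario` are hypothetical names chosen only for illustration.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Author:
    name: str
    posts: list = field(default_factory=list)


@dataclass
class Post:
    title: str
    author: Author
    comment_count: int


def build_blog_scenario(rng, authors=2, posts_per_author=3):
    """Return a list of authors, each linked to posts with fuzzed details."""
    result = []
    for a in range(authors):
        author = Author(name=f"author-{a}")
        for p in range(posts_per_author):
            post = Post(
                title=f"post-{a}-{p}",
                author=author,
                comment_count=rng.randint(0, 50),  # randomized detail
            )
            author.posts.append(post)
        result.append(author)
    return result


# The shape of the scenario is fixed by the arguments; the details vary.
scenario = build_blog_scenario(random.Random(7), authors=3, posts_per_author=2)
```

Because the builder lives outside any single test, the same scenario can back unit tests, a demo database, or a quick feedback session.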
What are your ideas about managing complex fixture data generation?