Instead of looking for a real-world dataset, you should design a small, specific dataset for each unit test. Each dataset should provide only the preconditions needed to verify a single feature of the system. This makes bugs easier to detect, keeps the tests maintainable over time, and shows other developers what the system can do and how it is meant to be used.
An example from a different domain would be the tests for a User Subsystem that creates and validates logins to a website. Each test is listed with the dataset it needs:
addsNewUser
- empty dataset
throwsExceptionForDuplicateUsername
- single-user dataset
correctPasswordPasses
- same single-user dataset
throwsExceptionForIncorrectUsername
- same single-user dataset
throwsExceptionForIncorrectPassword
- same single-user dataset
throwsExceptionWhenNewUsernameExists
- two-user dataset
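To make the list above concrete, here is a minimal sketch in Python using the standard unittest module. The UserSubsystem class, its method names, and the two exception types are hypothetical stand-ins I made up for illustration; they are not from any real library. I have also assumed that the last test is about renaming a user to a name that is already taken, which is only one possible reading of its title.

```python
import unittest


class DuplicateUsernameError(Exception):
    """Raised when a username is already taken (hypothetical)."""


class AuthenticationError(Exception):
    """Raised when a login attempt fails (hypothetical)."""


class UserSubsystem:
    """Hypothetical in-memory user store, for illustration only."""

    def __init__(self, users=None):
        # The "dataset" is just a dict of username -> password.
        self._users = dict(users or {})

    def add_user(self, username, password):
        if username in self._users:
            raise DuplicateUsernameError(username)
        self._users[username] = password

    def rename_user(self, old_name, new_name):
        if new_name in self._users:
            raise DuplicateUsernameError(new_name)
        self._users[new_name] = self._users.pop(old_name)

    def login(self, username, password):
        if username not in self._users or self._users[username] != password:
            raise AuthenticationError(username)
        return True


class UserSubsystemTest(unittest.TestCase):
    def test_adds_new_user(self):
        users = UserSubsystem()  # empty dataset
        users.add_user("alice", "s3cret")
        self.assertTrue(users.login("alice", "s3cret"))

    def test_throws_exception_for_duplicate_username(self):
        users = UserSubsystem({"alice": "s3cret"})  # single-user dataset
        with self.assertRaises(DuplicateUsernameError):
            users.add_user("alice", "other")

    def test_correct_password_passes(self):
        users = UserSubsystem({"alice": "s3cret"})
        self.assertTrue(users.login("alice", "s3cret"))

    def test_throws_exception_for_incorrect_username(self):
        users = UserSubsystem({"alice": "s3cret"})
        with self.assertRaises(AuthenticationError):
            users.login("bob", "s3cret")

    def test_throws_exception_for_incorrect_password(self):
        users = UserSubsystem({"alice": "s3cret"})
        with self.assertRaises(AuthenticationError):
            users.login("alice", "wrong")

    def test_throws_exception_when_new_username_exists(self):
        # two-user dataset: renaming bob to alice must be rejected
        users = UserSubsystem({"alice": "s3cret", "bob": "hunter2"})
        with self.assertRaises(DuplicateUsernameError):
            users.rename_user("bob", "alice")
```

Saved as a module, this can be run with `python -m unittest`. Note that each test builds its own tiny dataset inline, so a reader can see the exact precondition without opening a fixture file.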
Update: If you need a very large dataset for integration or performance testing, you will probably have to write a program that generates a random collection of purchases. I doubt any existing supermarkets are willing (or able) to part with their real datasets.
That being said, while working as a contractor for a health insurance provider many years ago (pre-HIPAA) I was given a sample dataset to work with. It contained real patient information including SSNs and confidential medical history. :(
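For the large-dataset case mentioned in the update, a generator along these lines might be enough to get started. This is only a sketch: the Purchase fields, the CATALOG items and prices, and the generate_purchases function are all invented for illustration, and a realistic performance test would probably want a distribution closer to real shopping behaviour than uniform random choices.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Purchase:
    """One line of a synthetic purchase record (hypothetical schema)."""
    customer_id: int
    item: str
    quantity: int
    unit_price_cents: int


# Made-up catalog of item -> price in cents; replace with your own.
CATALOG = {"apples": 89, "bread": 249, "eggs": 319, "milk": 129}


def generate_purchases(count, customers=1000, seed=42):
    """Return `count` random purchases; a fixed seed makes runs repeatable."""
    rng = random.Random(seed)  # private generator, does not touch global state
    items = sorted(CATALOG)
    purchases = []
    for _ in range(count):
        item = rng.choice(items)
        purchases.append(
            Purchase(
                customer_id=rng.randint(1, customers),
                item=item,
                quantity=rng.randint(1, 10),
                unit_price_cents=CATALOG[item],
            )
        )
    return purchases
```

Seeding the generator means a failing performance or integration run can be reproduced with exactly the same data, and no real customer information is ever involved.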