Random subset of list including specific element(s) in Python

Started by
6 comments, last by sooner123 9 years, 4 months ago
I would like to generate a random subset that contains a specific element.
For example, given a list of the letters of the alphabet, generate a 5-element sample that includes the 3rd element.

import random

alphabet = ['a', 'b', 'c', 'd', 'e', 'f', 'g', ...]
wantedSubset = random.sample(alphabet, 5)
This does everything I want except for guaranteeing that 'c' will be an element.

You can make a one-element list containing the letter you want, and then extend it with randomly chosen elements. Make sure the wanted element does not end up in there twice (unless you want it there twice).

This would have the problem that the wanted element might appear twice.

So is there a way to take a sample from a list, such that the sample won't contain a specific element?

Also, if I do it this way, assuming there is a way to generate a sample that doesn't contain a specific element, I then have to insert the wanted element at a random position in the sample.

This seems like a very convoluted approach.

You can remove your selected elements from the list, pick a random subset of the appropriate size from the remaining elements and then append it to your elements.

I don't know any Python, but that shouldn't be hard to do in any language.
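
For reference, the remove, sample, append idea described above might look roughly like this in Python. This is only a sketch: the helper name sample_including and its parameters are made up for illustration, and it assumes the list has no duplicate entries. The required elements end up at the front of the result, not at random positions.

```python
import random

def sample_including(items, required, k):
    """Return k items: all of `required` plus a random fill from the rest.

    Hypothetical helper, not part of the random module.
    """
    required = list(required)
    if k < len(required):
        raise ValueError("k is smaller than the number of required elements")
    # Everything that is not required is eligible for the random part.
    remaining = [x for x in items if x not in required]
    # random.sample raises ValueError if too few items remain.
    return required + random.sample(remaining, k - len(required))

alphabet = list("abcdefghijklmnopqrstuvwxyz")
subset = sample_including(alphabet, ["c"], 5)
```

Here subset always has 5 distinct letters and always contains 'c'; the other four vary from run to run.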

Well, that's how I've been doing it. What I don't like is that I have to take extra steps to re-insert that element into the subsample so that it ends up in a random location.

And just appending it then scrambling the sample seems wasteful since generating the sample already involved a permutation in the first place.

Is deleting the wanted element, taking a subsample of the remainder of the list, appending the wanted element to the subsample, and scrambling the subsample really the best way to do it?

The way I'm inserting it now is:

permuList.insert(randrange(len(permuList)+1), element)
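
The single random insert above already avoids a full reshuffle, since random.sample returns its picks in random order. One possible alternative, sketched here under the assumption that the list has no duplicates, is to append the wanted element and swap it with a randomly chosen slot. The function name sample_with is hypothetical, and k is assumed to be at least 1.

```python
import random

def sample_with(items, wanted, k):
    """Random k-element sample of `items` that always contains `wanted`."""
    # Sample the other k-1 elements; they already come back in random order.
    others = [x for x in items if x != wanted]
    result = random.sample(others, k - 1)
    # Put the wanted element at the end, then swap it with a random slot
    # (possibly its own), so it can land at any of the k positions.
    result.append(wanted)
    j = random.randrange(k)
    result[j], result[-1] = result[-1], result[j]
    return result

picked = sample_with(list("abcdefg"), "c", 5)
```

The swap is constant-time, while list.insert shifts the elements after the insertion point. For a 5-element sample the difference is negligible, so this is mostly a matter of taste.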
There is rarely a "best" way of doing anything. There are just different ways with different trade-offs.

To understand any trade-offs you're making, we'd need to understand the broader requirements. How frequently is this called? How important is the random distribution of elements in the new list? Is the "required" element always the same, or is that generated dynamically, and if so, how?

Unless this were in a hot loop being called thousands of times a second, I'd probably do something similar to what has been described.

If you know the index of c, you could do something like this:


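# Sample 4 elements from everything except the wanted one,
# then insert the wanted element at a random position (0..4).
wantedSubset = random.sample(alphabet[:indexOfElement] + alphabet[indexOfElement+1:], 4)
wantedSubset.insert(random.randrange(len(wantedSubset) + 1), alphabet[indexOfElement])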

I doubt you're gonna get anything much simpler than that. This isn't really a common sort of operation, so there's not likely to be a magic function that just does it.
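
If this did end up in a hot loop and copying the list on every call mattered, one option is to sample positions instead of elements. The sketch below is an assumption-laden illustration, not something from the standard library: sample_including_index is a made-up name, and it assumes 1 <= k <= len(items).

```python
import random

def sample_including_index(items, index, k):
    """Random k-element sample of `items` that always contains items[index]."""
    n = len(items)
    # Pick k-1 distinct positions from a range one shorter than the list...
    picks = random.sample(range(n - 1), k - 1)
    # ...and shift every position at or past `index` up by one, which skips it.
    result = [items[i if i < index else i + 1] for i in picks]
    # Drop the wanted element into one of the k possible positions.
    result.insert(random.randrange(k), items[index])
    return result

alphabet = list("abcdefghijklmnopqrstuvwxyz")
wanted_subset = sample_including_index(alphabet, 2, 5)
```

Sampling from a range avoids building the sliced copy of the list, at the cost of being a little harder to read.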

Thanks all

