(When you say "efficient", you probably need to be more explicit about whether you mean space efficiency or time efficiency. Let's assume you mean time efficiency, given that you mentioned permutations.)
The task of computing the answer to
String[] findStringsContaining(List<String> strings, String[] words)
can be partitioned and handed off to parallel threads of execution, since the intermediate stage is purely functional and side-effect free, with the results joined as a final step. That is, you can partition across the words and/or across the list of strings.
This is how map-reduce works (and in your case, it is irrelevant that it all happens on the same machine).
Your mapper (assigned to a thread for each of the words) is:
boolean[] stringContainsWord(List<String> strings, String word);
That method would execute in parallel.
The boolean array would then hold true at each index of the list whose string matched the given word.
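A minimal sketch of that mapper (assuming "contains" means a plain substring match; swap in word-boundary or case-insensitive matching if that is your actual rule):

// Needs java.util.List.
// result[i] is true if strings.get(i) contains the given word.
static boolean[] stringContainsWord(List<String> strings, String word) {
    boolean[] result = new boolean[strings.size()];
    for (int i = 0; i < strings.size(); i++) {
        result[i] = strings.get(i).contains(word); // plain substring test
    }
    return result;
}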
And your reducer (running after all mappers have finished) is:
List<String> getMatchingList(List<String> strings, List<boolean[]> mapperResults);
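A sketch of that reducer, assuming a string counts as a match when it contains at least one of the input words (flip the check to require every array to be true if you need all words present):

// Needs java.util.List and java.util.ArrayList.
// Joins the per-word boolean arrays: keeps strings.get(i) if any mapper flagged index i.
static List<String> getMatchingList(List<String> strings, List<boolean[]> mapperResults) {
    List<String> matches = new ArrayList<>();
    for (int i = 0; i < strings.size(); i++) {
        for (boolean[] perWord : mapperResults) {
            if (perWord[i]) {                // "any word" rule
                matches.add(strings.get(i));
                break;                       // already matched; skip remaining words
            }
        }
    }
    return matches;
}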
Putting aside the thread overhead, and assuming the number of input words (and hence mapper threads) stays reasonable, this gives you roughly O(n) time per mapper (with the mappers running in parallel) plus O(n·m) worst case for the reducer to join the results, where n is the number of items in your list of strings and m is the number of words in your input.
You can further parallelize the task by also partitioning your list of strings: run p threads per word, each searching a subset of the list, so that the input List to each mapper call is 1/p of the overall list.
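Putting the pieces together, here is a hedged sketch of a driver for the simpler one-thread-per-word version, using an ExecutorService and the two method sketches above (the pool sizing and the blanket throws Exception are simplifications you would tune for real use):

// Needs java.util.* and java.util.concurrent.*; assumes stringContainsWord and
// getMatchingList from the sketches above are in scope.
static String[] findStringsContaining(List<String> strings, String[] words) throws Exception {
    int threads = Math.max(1, Math.min(words.length, Runtime.getRuntime().availableProcessors()));
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
        // Map phase: one task per word, each scanning the full list independently.
        List<Future<boolean[]>> futures = new ArrayList<>();
        for (String word : words) {
            futures.add(pool.submit(() -> stringContainsWord(strings, word)));
        }
        // Wait for all mappers to finish and collect their boolean arrays.
        List<boolean[]> mapperResults = new ArrayList<>();
        for (Future<boolean[]> f : futures) {
            mapperResults.add(f.get());
        }
        // Reduce phase: single-threaded join of the per-word results.
        return getMatchingList(strings, mapperResults).toArray(new String[0]);
    } finally {
        pool.shutdown();
    }
}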
--
An alternative approach which you may want to consider, especially if the list of strings is huge and the content is natural language (such as English), is to exploit the fact that most languages have a fairly small set of words that make up the bulk of their sentences. For example, if your list has 2 million English sentences, chances are the set of distinct words is orders of magnitude smaller.
In that case, you can build a Map of word -> sentences, and finding the matching sentences for any given word reduces to a single map lookup.
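A rough sketch of that index (the names buildIndex and sentencesContaining are just illustrative, and the whitespace tokenization plus lower-casing are assumptions you would adapt to your data):

// Needs java.util.*.
// Build the index once: word -> sentences containing that word.
static Map<String, List<String>> buildIndex(List<String> sentences) {
    Map<String, List<String>> index = new HashMap<>();
    for (String sentence : sentences) {
        // Naive whitespace tokenization, lower-cased; a real tokenizer would also strip punctuation.
        Set<String> tokens = new HashSet<>(Arrays.asList(sentence.toLowerCase().split("\\s+")));
        for (String token : tokens) {
            // The token set ensures each sentence is added at most once per word.
            index.computeIfAbsent(token, k -> new ArrayList<>()).add(sentence);
        }
    }
    return index;
}

// Each query is now a single map lookup instead of a scan over all sentences.
static List<String> sentencesContaining(Map<String, List<String>> index, String word) {
    return index.getOrDefault(word.toLowerCase(), Collections.emptyList());
}

Building the index is a one-time O(total words) pass; after that, each lookup is effectively constant time.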
(Note that you can still combine the initial approach with this.)