I got the following problem for the Google Coding Challenge which happened on 16th August 2020. I tried to solve it but couldn't.
There are N words in a dictionary such that each word is of fixed
length M and consists only of lowercase English letters, that is
('a', 'b', ...,'z'). A query word is denoted by Q. The length
of query word is M. These words contain lowercase English letters
but at some places instead of a letter between 'a', 'b', ...,'z'
there is '?'. Refer to the Sample input section to understand this
case. A match count of Q, denoted by match_count(Q) is the
count of words that are in the dictionary and contain the same English
letters(excluding a letter that can be in the position of ?) in the
same position as the letters are there in the query word Q. In other
words, a word in the dictionary can contain any letters at the
position of '?' but the remaining alphabets must match with the
query word.
You are given a query word Q and you are required to compute
match_count.
Input Format
The first line contains two space-separated integers N and M denoting the number of words in the dictionary and length of each word
respectively.
The next N lines contain one word each from the dictionary.
The next line contains an integer Q denoting the number of query words for which you have to compute match_count.
The next Q lines contain one query word each.
Output Format
For each query word, print match_count for a specific word in a new line.
Constraints
1 <= N <= 5X10^4
1 <= M <= 7
1 <= Q <= 10^5
So, I got 30 minutes for this question and I could write the following code which is incorrect and hence didn't give the expected output.
def Solve(N, M, Words, Q, Query):
    output = []
    count = 0
    for i in range(Q):
        x = Query[i].split('?')
        for k in range(N):
            if x in Words:
                count += 1
            else:
                pass
        output.append(count)
    return output

N, M = map(int , input().split())
Words = []
for _ in range(N):
    Words.append(input())
Q = int(input())
Query = []
for _ in range(Q):
    Query.append(input())
out = Solve(N, M, Words, Q, Query)
for x in out_:
    print(x)
Can somebody help me with some pseudocode or algorithm which can solve this problem, please?
I guess my first try would have been to replace the ? with a . in the query, i.e. change ?at to .at, and then use those as regular expressions and match them against all the words in the dictionary, something as simple as this:
import re
for q in queries:
    p = re.compile(q.replace("?", "."))
    print(sum(1 for w in words if p.match(w)))
However, seeing the input sizes as N up to 5x10^4 and Q up to 10^5, this might be too slow, just as any other algorithm comparing all pairs of words and queries.
On the other hand, note that M, the number of letters per word, is constant and rather low. So instead, you could create Mx26 sets of words for all letters in all positions and then get the intersection of those sets.
from collections import defaultdict
from functools import reduce

M = 3
words = ["cat", "map", "bat", "man", "pen"]
queries = ["?at", "ma?", "?a?", "??n"]

sets = defaultdict(set)
for word in words:
    for i, c in enumerate(word):
        sets[i,c].add(word)

all_words = set(words)
for q in queries:
    possible_words = (sets[i,c] for i, c in enumerate(q) if c != "?")
    w = reduce(set.intersection, possible_words, all_words)
    print(q, len(w), w)
In the worst case (a query that has a non-? letter that is common to most or all words in the dictionary) this may still be slow, but should be much faster in filtering down the words than iterating all the words for each query. (Assuming random letters in both words and queries, the set of words for the first letter will contain N/26 words, the intersection for the first two has N/26² words, etc.)
This could probably be improved a bit by taking the different cases into account, e.g. (a) if the query does not contain any ?, just check whether it is in the set (!) of words without creating all those intersections; (b) if the query is all-?, just return the set of all words; and (c) sort the possible-words-sets by size and start the intersection with the smallest sets first to reduce the size of temporarily created sets.
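The three special cases (a), (b) and (c) could be sketched roughly as follows. This is only an illustration: the helper names build_index and match_count are made up here, and, like the set-based code above, it assumes the dictionary words are distinct.

```python
from collections import defaultdict

def build_index(words):
    """Map (position, letter) -> set of words with that letter at that position."""
    index = defaultdict(set)
    for word in words:
        for i, c in enumerate(word):
            index[i, c].add(word)
    return index

def match_count(query, index, all_words):
    fixed = [(i, c) for i, c in enumerate(query) if c != "?"]
    if not fixed:                      # (b) all '?': every word matches
        return len(all_words)
    if len(fixed) == len(query):       # (a) no '?': plain membership test
        return int(query in all_words)
    # (c) intersect the candidate sets, smallest first
    candidates = sorted((index.get(key, set()) for key in fixed), key=len)
    result = candidates[0]
    for s in candidates[1:]:
        if not result:                 # already empty, nothing left to filter
            break
        result = result & s
    return len(result)

words = ["cat", "map", "bat", "man", "pen"]
index = build_index(words)
all_words = set(words)
for q in ["?at", "ma?", "?a?", "??n"]:
    print(q, match_count(q, index, all_words))
```

Using index.get instead of index[...] avoids creating empty sets in the defaultdict for letters that never occur.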
About time complexity: To be honest, I am not sure what time complexity this algorithm has. With N, Q, and M being the number of words, number of queries, and length of words and queries, respectively, creating the initial sets will have complexity O(N*M). After that, the complexity of the queries obviously depends on the number of non-? in the queries (and thus the number of set intersections to create), and the average size of the sets. For queries with zero, one, or M non-? characters, the query will execute in O(M) (evaluating the situation and then a single set/dict lookup), but for queries with two or more non-?-characters, the first set intersections will have on average complexity O(N/26), which strictly speaking is still O(N). (All following intersections will only have to consider N/26², N/26³ etc. elements and are thus negligible.) I don't know how this compares to The Trie Approach and would be very interested if any of the other answers could elaborate on that.
This question can be solved with a trie.
First, add all the words to the trie.
Then check whether the query word is present in the trie. The special case is '?': when the current character is '?', follow every child of the current node instead of a single one, and then go on to the next character of the word.
I think this approach will work; there is a similar question on LeetCode.
Link : https://leetcode.com/problems/design-add-and-search-words-data-structure/
This should be an O(N) time and space approach, given that M is small and can be considered constant. You might want to look at an implementation of a trie here.
Perform the first pass and store the words in a trie.
Next, for each query, perform a combination of DFS and BFS in the following order.
If you encounter a ?, perform a BFS step and add all the children.
For a non-?, perform a DFS step, and that should point to the existence of a word.
For further optimization, a suffix tree may also be used for storage DS.
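A minimal Python sketch of the trie idea, for illustration only. It uses nested dicts and a plain recursive DFS for both the '?' and the letter case (rather than mixing BFS and DFS), and the "#" end marker and the function names are choices made for this example.

```python
def build_trie(words):
    """Nested-dict trie; the '#' key stores how many words end at a node."""
    root = {}
    for word in words:
        node = root
        for c in word:
            node = node.setdefault(c, {})
        node["#"] = node.get("#", 0) + 1
    return root

def match_count(node, query, i=0):
    if i == len(query):
        return node.get("#", 0)
    c = query[i]
    if c == "?":
        # wildcard: try every child (skipping the end marker)
        return sum(match_count(child, query, i + 1)
                   for key, child in node.items() if key != "#")
    child = node.get(c)
    return match_count(child, query, i + 1) if child is not None else 0

trie = build_trie(["cat", "map", "bat", "man", "pen"])
for q in ["?at", "ma?", "?a?", "??n"]:
    print(q, match_count(trie, q))
```

The recursion depth is bounded by the query length, which is at most 7 under the stated constraints.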
You can use a simplified version of trie as the query string has pre-defined length. No need of ends variable in the Trie node
#include <bits/stdc++.h>
using namespace std;

typedef struct TrieNode_ {
    struct TrieNode_* nxt[26];
} TrieNode;

void addWord(TrieNode* root, string s) {
    TrieNode* node = root;
    for(size_t i = 0; i < s.size(); ++i) {
        if(node->nxt[s[i] - 'a'] == NULL) {
            // value-initialize so that all child pointers start out as NULL
            node->nxt[s[i] - 'a'] = new TrieNode();
        }
        node = node->nxt[s[i] - 'a'];
    }
}

void matchCount(TrieNode* root, string s, int& cnt) {
    if(root == NULL) {
        return;
    }
    if(s.empty()) {
        ++cnt;
        return;
    }
    TrieNode* node = root;
    if(s[0] == '?') {
        // wildcard: descend into every child
        for(int i = 0; i < 26; ++i) {
            matchCount(node->nxt[i], s.substr(1), cnt);
        }
    }
    else {
        matchCount(node->nxt[s[0] - 'a'], s.substr(1), cnt);
    }
}

int main() {
    int N, M;
    cin >> N >> M;
    vector<string> s(N);
    // value-initialize the root as well, otherwise its pointers are garbage
    TrieNode *root = new TrieNode();
    for (int i = 0; i < N; ++i) {
        cin >> s[i];
        addWord(root, s[i]);
    }
    int Q;
    cin >> Q;
    for(int i = 0; i < Q; ++i) {
        string queryString;
        int cnt = 0;
        cin >> queryString;
        matchCount(root, queryString, cnt);
        cout << cnt << endl;
    }
}
Notes: 1. This code doesn't read the input but instead takes params from main method.
2. For large inputs, we could use java 8 streams to parallelize the search process and improve the performance.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class WordSearch {
private void matchCount(int N, int M, int Q, String[] words, String[] queries) {
Pattern p = null;
Matcher m = null;
int count = 0;
for (int i=0; i<Q; i++) {
p = Pattern.compile(queries[i].replace('?','.'));
for (int j=0; j<N; j++) {
m = p.matcher(words[j]);
if (m.find()) {
count++;
}
}
System.out.println("For query word '"+ queries[i] + "', the count is: " + count) ;
count=0;
}
System.out.println("\n");
}
public static void main(String[] args) {
WordSearch ws = new WordSearch();
int N = 5; int M=3; int Q=4;
String[] w = new String[] {"cat", "map", "bat", "man", "pen"};
String[] q = new String[] {"?at", "ma?", "?a?", "??n" };
ws.matchCount(N, M, Q, w, q);
w = new String[] {"uqqur", "lxzev", "ydfgz"};
q = new String[] {"?z???", "???i?", "???e?", "???f?", "?z???"};
N=3; M=5; Q=5;
ws.matchCount(N, M, Q, w, q);
}
}
I can think of a kind of trie with a BFS-based lookup:
from queue import SimpleQueue


class Node:
    def __init__(self, letter):
        self.letter = letter
        self.children = {}

    @classmethod
    def construct(cls):
        return cls(letter=None)

    def add_word(self, word):
        current = self
        for letter in word:
            if letter not in current.children:
                node = Node(letter)
                current.children[letter] = node
            else:
                node = current.children[letter]
            current = node

    def lookup_word(self, word, m):
        def _lookup_next_letter(_letter, _node):
            # enqueue every child for '?', otherwise only the matching child
            if _letter == '?':
                for node in _node.children.values():
                    q.put((node, i))
            elif _letter in _node.children:
                q.put((_node.children[_letter], i))

        q = SimpleQueue()
        count = 0
        i = 0
        current = self
        letter = word[i]
        i += 1
        _lookup_next_letter(letter, current)
        while not q.empty():
            current, i = q.get()
            if i == m:
                count += 1
                continue
            letter = word[i]
            i += 1
            _lookup_next_letter(letter, current)
        return count

    def __eq__(self, other):
        return self.letter == other.letter if isinstance(other, Node) else other

    def __hash__(self):
        return hash(self.letter)
I would create a lookup table for each letter of each word, and then use that table to iterate with. While the lookup table will cost O(NM) memory (or 15 entries in the situation shown), it will allow an easy O(NM) time complexity to be implemented, with a best case O(log N * log M).
The lookup table can be stored in the form of a coordinate plane. Each letter will have an "x" position (the letters index) as well as a "y" position (the words index in the dictionary). This will allow a quick cross reference from the query to look up a letter's position for existence and the word's position for eligibility.
Worst case, this approach has a time complexity O(NM) whereby there must be N iterations, one for each dictionary entry, times M iterations, one for each letter in each entry. In many cases it will skip the lookups though.
The coordinate system that is created also has O(NM) space complexity.
I am unfamiliar with Python, so this is written in JavaScript, which was as close as I could come language-wise. Hopefully this at least serves as an example of a possible solution.
In addition, I included a heavily loaded section to use for performance comparisons. It takes about 5 seconds to complete a set with 2000 words and 5000 queries, each of length 200.
// Main function running the analysis
function run(dict, qs) {
// Use a coordinate system for tracking the letter and position
var coordinates = 'abcdefghijklmnopqrstuvwxyz'.split('').reduce((p, c) => (p[c] = {}, p), {});
// Populate the system
for (var i = 0; i < dict.length; i++) {
// Current word in the given dictionary
var dword = dict[i];
// Iterate the letters for tracking
for (var j = 0; j < dword.length; j++) {
// Current letter in our current word
var letter = dword[j];
// Make sure that there is object existence for assignment
coordinates[letter][j] = coordinates[letter][j] || {};
// Note the letter's coordinate by storing its array
// position (i) as well as its letter position (j)
coordinates[letter][j][i] = 1;
}
}
// Lookup the word letter by letter in our coordinate system
function match_count(Q) {
// Create an array which maps from the dictionary indices
// to a truthy value of 1 for tracking successful matches
var availLookup = dict.reduce((p,_,i) => (p[i]=1,p),{});
// Iterate the letters of Q to check against the coordinate system
for (var i = 0; i < Q.length; i++) {
// Current letter in Q
var letter = Q[i];
// Skip '?' characters
if (letter == '?') continue;
// Look up the existence of "points" in our coordinate system for
// the current letter
var points = coordinates[letter];
// If nothing from the dictionary matches in this position,
// then there are no matches anywhere and we return a 0
if (!points || !points[i]) return 0;
// Iterate the availability truth table made earlier
// and look up whether any points in our coordinate system
// are present for the current letter. If they are, then the word
// remains, if not, it is removed from consideration.
for(var n in availLookup){
if(!points[i][n]) delete availLookup[n];
}
}
// Sum the "truthy" 1 values we used earlier to determine the count of
// matched words
return Object.values(availLookup).reduce((x, y) => x + y, 0);
}
var matches = [];
for (var i = 0; i < qs.length; i++) {
matches.push(match_count(qs[i]));
}
return matches;
}
document.querySelector('button').onclick=_=>{
console.clear();
var d1 = [
'cat',
'map',
'bat',
'man',
'pen'
];
var q1 = [
'?at',
'ma?',
'?a?',
'??n'
];
console.log('running...');
console.log(run(d1, q1));
var d2 = [
'uqqur',
'lxzev',
'ydfgz'
];
var q2 = [
'?z???',
'???i?',
'???e?',
'???f?',
'?z???'
];
console.log('running...');
console.log(run(d2, q2));
// Load it up (try this with other versions to compare with efficiency)
var d3 = [];
var q3 = [];
var wordcount = 2000;
var querycount = 5000;
var len = 200;
var alphabet = 'abcdefghijklmnopqrstuvwxyz'.split('');
for(var i = 0; i < wordcount; i++){
var word = "";
for(var n = 0; n < len; n++){
var rand = (Math.random()*25)|0;
word += alphabet[rand];
}
d3.push(word);
}
for(var i = 0; i < querycount; i++){
var qword = d3[(Math.random()*(wordcount-1))|0];
var query = "";
for(var n = 0; n < len; n++){
var rand = (Math.random()*100)|0;
if(rand > 98){ query += alphabet[(Math.random()*25)|0]; }
else{ query += rand > 75 ? qword[n] : '?'; }
}
q3.push(query);
}
if(document.querySelector('input').checked){
//console.log(d3,q3);
console.log('running...');
console.log(run(d3, q3).reduce((x, y) => x + y, 0) + ' matches');
}
};
<input type=checkbox>Include the ~5 second larger version<br>
<button type=button>run</button>
I don't know Python, but the gist of the naive algorithm looks like this:
# count how many words in the Words list match a single query
def DoQuery(Words, OneQuery):
    count = 0
    # for each word in the Words list
    for word in Words:
        # compare each letter to the query
        match = True
        for j in range(len(word)):
            wordLetter = word[j]
            queryLetter = OneQuery[j]
            # if the letters do not match and are not ?, then skip to next word
            if queryLetter != '?' and queryLetter != wordLetter:
                match = False
                break
        # if we did not skip, the words match. Increase the count
        if match:
            count = count + 1
    # we have now checked all the words, return the count
    return count
Of course, this executes the innermost loop around 3.5x10^10 times (N x Q x M = 5x10^4 x 10^5 x 7), which might be too slow. So one would need to read in the dictionary, precompute some sort of shortcut data structure, then use the shortcut to find the answers faster.
One shortcut data structure would be to make a map of possible queries to answers, making the query O(1). There are only 27^7 (about 1.05*10^10) possible queries of length 7, and at most N*2^M = 6.4*10^6 of them can match any word at all, so this is possibly faster.
A similar shortcut data structure would be to make a trie of possible queries to answers, making the query O(M). The same bound on the number of possible queries applies, so this is possibly faster. This is more complex code, but may also be easier to understand for some people.
Another shortcut would be to "assume" each query has exactly one non-question-mark, and make a map of possible queries to subset dictionaries. This would mean you'd still have to run the naive query on the subset dictionary, but it would be ~26x smaller, and thus ~26x faster. You'd also have to convert the real query into only having one non-question-mark to lookup the subset dictionary in the map, but that should be easy.
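As a rough sketch of the "map of possible queries to answers" idea: instead of enumerating every conceivable query, one can store only the patterns that match at least one word. Each word of length M matches exactly 2^M patterns (every position either keeps its letter or becomes '?'), so the map holds at most N*2^M entries. The name build_pattern_counts is made up for this example, and memory use in Python may be considerable at the upper limits.

```python
from collections import Counter
from itertools import product

def build_pattern_counts(words):
    """Count, for every pattern that matches some word, how many words it matches."""
    counts = Counter()
    for word in words:
        # each position either keeps its letter or is replaced by '?'
        for pattern in product(*((c, "?") for c in word)):
            counts["".join(pattern)] += 1
    return counts

counts = build_pattern_counts(["cat", "map", "bat", "man", "pen"])
for q in ["?at", "ma?", "?a?", "??n"]:
    print(q, counts[q])   # a Counter returns 0 for patterns it has never seen
```

After the precomputation, each query is a single dictionary lookup.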
I think we can use trie to solve this problem.
Initially, we will just add all the strings to the trie, and later when we get each query we can just check whether it exists in trie or not.
The only difference here is the '?', which we can treat as a wildcard that matches any character: whenever we encounter a '?' in the search string, we look at all the children of the current node and do a DFS, searching for the rest of the word along every possible path.
Below is the C++ code
#include <iostream>
#include <string>
#include <vector>
using namespace std;

class Trie {
public:
    bool isEnd;
    vector<Trie*> children;
    Trie() {
        this->isEnd = false;
        this->children = vector<Trie*>(26, nullptr);
    }
};

Trie* root;

void insert(string& str) {
    int n = str.size(), idx, i = 0;
    Trie* node = root;
    while(i < n) {
        idx = str[i++] - 'a';
        if (node->children[idx] == nullptr) {
            node->children[idx] = new Trie();
        }
        node = node->children[idx];
    }
    node->isEnd = true;
}

int getMatches(int i, string& str, Trie* node) {
    int idx, n = str.size();
    while(i < n) {
        if (str[i] >= 'a' && str[i] <='z')
            idx = str[i] - 'a';
        else {
            // '?': sum the matches over every existing child
            int res = 0;
            for(int j = 0;j<26;j++) {
                if (node->children[j] != nullptr)
                    res += getMatches(i+1, str, node->children[j]);
            }
            return res;
        }
        if (node->children[idx] == nullptr) return 0;
        node = node->children[idx];
        ++i;
    }
    return node->isEnd ? 1 : 0;
}

int main() {
    int n, m;
    cin>>n>>m;
    string str;
    root = new Trie();
    while(n--) {
        cin>>str;
        insert(str);
    }
    int q;
    cin>>q;
    while(q--) {
        cin>>str;
        cout<<((int)str.size() == m ? getMatches(0, str, root) : 0)<<"\n";
    }
}
Can I do it with ASCII values, like this:
For the characters in the query word, calculate the sum of their ASCII values.
For each word in the dictionary, calculate the ASCII values character by character and compare them with the ASCII sum of the query word. For example, for bat: if the ASCII value of b matches the ASCII sum of the query word, increment the count; otherwise calculate the ASCII value of a and check it against the query sum; if it does not match, add it to the ASCII value of b and check again, and so on; at last, return the count.
How's this approach?
Java Implementation using Trie
import java.util.*;
import java.io.*;
import java.lang.*;
public class Main {
static class TrieNode
{
TrieNode []children = new TrieNode[26];
boolean endOfWord;
TrieNode()
{
this.endOfWord = false;
for (int i = 0; i < 26; i++) {
this.children[i] = null;
}
}
void addWord(String word)
{
// Crawl pointer points the object
// in reference
TrieNode pCrawl = this;
// Traverse the given array of words
for (int i = 0; i < word.length(); i++) {
int index = word.charAt(i) - 'a';
if (pCrawl.children[index]==null)
pCrawl.children[index]
= new TrieNode();
pCrawl = pCrawl.children[index];
}
pCrawl.endOfWord = true;
}
public static int ans2 = 0;
void search(String word, boolean found, String curr_found, int pos)
{
TrieNode pCrawl = this;
if (pos == word.length()) {
if (pCrawl.endOfWord) {
found = true;
ans2++;
}
return;
}
if (word.charAt(pos) == '?') {
// Iterate over every letter and
// proceed further by replacing
// the character in place of '.'
for (int i = 0; i < 26; i++) {
if (pCrawl.children[i] != null) {
pCrawl.children[i].search(word,found,curr_found + (char)('a' + i),pos + 1);
}
}
}
else { // Check if pointer at character
// position is available,
// then proceed
if (pCrawl.children[word.charAt(pos) - 'a'] != null) {
pCrawl.children[word.charAt(pos) - 'a']
.search(word,found,curr_found + word.charAt(pos),pos + 1);
}
}
return;
}
// Utility function for search operation
int searchUtil(String word)
{
TrieNode pCrawl = this;
boolean found = false;
ans2 = 0;
pCrawl.search(word, found,"",0);
return ans2;
}
}
static int searchPattern(String arr[], int N,String str)
{
// Object of the class Trie
TrieNode obj = new TrieNode();
for (int i = 0; i < N; i++) {
obj.addWord(arr[i]);
}
// Search pattern
return obj.searchUtil(str);
}
public static void ans(String []arr , int n, int m,String [] query, int q){
for(int i=0;i<q;i++)
System.out.println(searchPattern(arr,n,query[i]));
}
public static void main(String args[]) {
Scanner scn = new Scanner(System.in);
int n = scn.nextInt();
int m = scn.nextInt();
String []arr = new String[n];
for(int i=0;i<n;i++){
arr[i] = scn.next();
}
int q = scn.nextInt();
String []query = new String[q];
for(int i=0;i<q;i++){
query[i] = scn.next();
}
ans(arr,n,m,query,q);
}
}
This is brute force, but a trie is a better implementation.
import collections

"""
Input: db, which is a list of words
chk: str to find
"""
def check(db,chk):
    seen = collections.defaultdict(list)
    for i in db:
        # note: only patterns with exactly one "?" are generated here
        for j in range(len(i)):
            temp = i[:j] + "?" + i[j+1:]
            seen[temp].append(i)
    return len(seen[chk])

print(check(["cat","bat"], "?at"))
Sounds like it was a coding challenge about https://en.wikipedia.org/wiki/Space%E2%80%93time_tradeoff
Depending on parameters N,M,Q as well as data and query distribution, the "best" algorithm will be different. A simple example, given the query ??? you know the answer — the length of the dictionary — without any computation 😸
In the general case, most likely, it pays to create a search index in advance (that is while reading the dictionary, before any query is seen).
I'd go with this: number the input 0 cat; 1 map; ...
Then build a search index per letter position:
index = [
    {"c": 0b00001, "m": 0b00010, ...},  # first query letter
    {"a": 0b01111, "e": 0b10000},       # second query letter
]
Prepare all = 0b11111 (all bits set) as "matches everything".
Then query lookup: ?a? ⇒ all & index[1]["a"] & all. †
Afterwards you'll need to count number of bits set in the result.
The time complexity of single query is therefore O(N) * (M + O(1)) ‡, which is a decent trade-off.
The entire batch is O(N*M*Q).
Python (as well as es2020) supports native arbitrary precision integers, which can be elegantly used for bitmaps, as well as native dictionaries, use them :) However if the data is sparse, an adaptive or compressed bitmap such as https://pypi.org/project/roaringbitmap may perform better.
† In practice ... & index[1].get("a", 0) & ... in case you hit a blank.
‡ Python data structure time complexity is reported O(...) amortised worst case while in CS O(...) worst case is usually considered. While the difference is subtle, it can bite even experienced developers, see e.g. https://bugs.python.org/issue13703
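The bitmap index could look roughly like this in Python, using plain integers as bitmaps. This is a sketch under the assumption that every query has the same length as the dictionary words; the function names are made up for the example.

```python
def build_bitmap_index(words):
    """index[i][c] has bit n set when word number n has letter c at position i."""
    m = len(words[0]) if words else 0
    index = [dict() for _ in range(m)]
    for n, word in enumerate(words):
        for i, c in enumerate(word):
            index[i][c] = index[i].get(c, 0) | (1 << n)
    return index

def match_count(query, index, n_words):
    result = (1 << n_words) - 1            # all bits set: matches everything
    for i, c in enumerate(query):
        if c != "?":
            result &= index[i].get(c, 0)   # 0 when no word has c at position i
    return bin(result).count("1")          # number of bits set

words = ["cat", "map", "bat", "man", "pen"]
index = build_bitmap_index(words)
for q in ["?at", "ma?", "?a?", "??n"]:
    print(q, match_count(q, index, len(words)))
```

Because duplicates get separate bit positions, this variant counts repeated dictionary words individually.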
One approach could be to use Python's fnmatch module (for every pattern sum the matches in words):
import fnmatch
names = ['uqqur', 'lxzev', 'ydfgs']
patterns = ['?z???', '???i?', '???e?', '???f?', '?z???']
[sum(fnmatch.fnmatch(name, pattern) for name in names) for pattern in patterns]
# [0, 0, 1, 0, 0]
Recently HackerRank launched their own certifications. Among the tests they offer is "Problem Solving". The test contains 2 problems; they give you 90 minutes to solve them. Being inexperienced as I am, I failed, because it took me longer than that.
Specifically, I came up with the solution for the first problem (filled orders, see below) in about 30 minutes, and spent the rest of the time trying to debug it. The problem with it wasn't that the solution didn't work, but that it worked on only some of the test cases.
Out of 14 testcases the solution worked on 7 (including all the open ones and a bunch of closed ones), and didn't work on the remaining 7 (all closed). Closed means that the input data is not available, as well as expected output. (Which makes sense, because some of the lists there included 250K+ elements.)
But it drives me crazy; I can't figure out what might be wrong with it. I tried putting print statements all over the place, but the only thing I came to is that 1 too many elements get added to the list - hence, the last if statement (to drop the last added element), but it made no difference whatsoever, so it's probably wrong.
Here's the problem:
A widget manufacturer is facing unexpectedly high demand for its new product. They would like to satisfy as many customers as possible. Given a number of widgets available and a list of customer orders, what is the maximum number of orders the manufacturer can fulfill in full?
Function Description
Complete the function filledOrders in the editor below. The function must return a single integer denoting the maximum possible number of fulfilled orders.
filledOrders has the following parameter(s):
order : an array of integers listing the orders
k : an integer denoting widgets available for shipment
Constraints
1 ≤ n ≤ 2 x 10^5
1 ≤ order[i] ≤ 10^9
1 ≤ k ≤ 10^9
Sample Input For Custom Testing
2
10
30
40
Sample Output
2
And here's my function:
def filledOrders(order, k):
    total = k
    fulf = []
    for r in order:
        if r <= total:
            fulf.append(r)
            total -= r
        else:
            break
    if sum(fulf) > k:
        fulf.pop()
    return len(fulf)
Java Solution
int count = 0;
Collections.sort(order);
for(int i=0; i<order.size(); i++) {
if(order.get(i)<=k) {
count++;
k = k - order.get(i);
}
}
return count;
Code Revision
def filledOrders(order, k):
    total = 0
    for i, v in enumerate(sorted(order)):
        if total + v <= k:
            total += v       # total stays <= k
        else:
            return i         # provides the count
    else:
        return len(order)    # was able to place all orders

print(filledOrders([3, 2, 1], 3))  # Out: 2
print(filledOrders([3, 2, 1], 1))  # Out: 1
print(filledOrders([3, 2, 1], 10)) # Out: 3
print(filledOrders([3, 2, 1], 0))  # Out: 0
Advanced Javascript solution :
function filledOrders(order, k) {
// Write your code here
let count = 0; let total=0;
const ordersLength = order.length;
const sortedOrders = order.sort(function(a,b) {
return (+a) - (+b);
});
for (let i = 0; i < ordersLength; i++) {
if (total + sortedOrders[i] <= k) {
// if all orders able to be filled
if (total <= k && i === ordersLength - 1) return ordersLength;
total += sortedOrders[i];
count++;
} else {
return count;
}
}
}
Python code
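def filledOrders(order, k):
    orderfulfilled=0
    order=sorted(order)  # fill the smallest orders first
    for i in range(len(order)):  # start at index 0 so the first order is not skipped
        m=k-order[i]
        if(m>=0):
            orderfulfilled+=1
            k-=order[i]
    return(orderfulfilled)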
Javascript solution
Option1:
function filledOrders(order, k) {
let count=0;
let arr= [];
arr = order.sort((a, b) => a - b).filter((item, index) => { // numeric sort; the default sort compares as strings
if (item<=k) {
k = k - item;
return item
}
})
return arr.length
}
Option2:
function filledOrders(order, k) {
let count=0;
order.sort((a, b) => a - b); // sort once, numerically; the default sort compares as strings
for(var i=0; i<order.length; i++) {
if(order[i]<=k) {
count++;
k = k - order[i]
}
}
return count;
}
C#
using System.CodeDom.Compiler;
using System.Collections.Generic;
using System.Collections;
using System.ComponentModel;
using System.Diagnostics.CodeAnalysis;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Reflection;
using System.Runtime.Serialization;
using System.Text.RegularExpressions;
using System.Text;
using System;
using System.Reflection.Metadata.Ecma335;
class Result
{
/*
* Complete the 'filledOrders' function below.
*
* The function is expected to return an INTEGER.
* The function accepts following parameters:
* 1. INTEGER_ARRAY order
* 2. INTEGER k
*/
public static int filledOrders(List<int> order, int k)
{
if (order.Sum(x => (long)x) <= k) // sum as long: the total of up to 2 x 10^5 orders of up to 10^9 overflows int
{
return order.Count();
}
else
{
int counter = 0;
foreach (int element in order)
{
if (element <= k)
{
counter++;
k = k - element;
}
}
return counter;
}
}
}
class Solution
{
public static void Main(string[] args)
{
int orderCount = Convert.ToInt32(Console.ReadLine().Trim());
List<int> order = new List<int>();
for (int i = 0; i < orderCount; i++)
{
int orderItem = Convert.ToInt32(Console.ReadLine().Trim());
order.Add(orderItem);
}
int k = Convert.ToInt32(Console.ReadLine().Trim());
var orderedList = order.OrderBy(a=>a).ToList();
int result = Result.filledOrders(orderedList, k);
Console.WriteLine(result);
}
}
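I think a better way to approach this (to decrease time complexity) is to solve it without sorting. (Of course, that comes at the cost of readability.)
Below is a solution without sorting. (I am not sure it covers all edge cases. For example, it returns 1 for order [1, 1, 5] with k = 2, where the correct answer is 2, so treat it as a heuristic rather than a replacement for the sorted greedy approach.)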
import os, sys

def max_fulfilled_orders(order_arr, k):
    # track the max no.of orders in the arr.
    max_num = 0
    # order count, can be fulfilled.
    order_count = 0
    # iter over order array
    for i in range(0, len(order_arr)):
        # if remain value < 0 then
        if k - order_arr[i] < 0:
            # add the max no.of orders to total
            k += max_num
            if order_count > 0:
                # decrease order_count
                order_count -= 1
        # if the remain value >= 0
        if(k - order_arr[i] >= 0):
            # subtract the current no.of orders from total.
            k -= order_arr[i]
            # increase the order count.
            order_count += 1
            # track the max no.of orders till the point.
            if order_arr[i] > max_num:
                max_num = order_arr[i]
    return order_count

print(max_fulfilled_orders([3, 2, 1], 0)) # Out: 0
print(max_fulfilled_orders([3, 2, 1], 1)) # Out: 1
print(max_fulfilled_orders([3, 1, 1], 2)) # Out: 2
print(max_fulfilled_orders([3, 2, 4], 9)) # Out: 3
print(max_fulfilled_orders([3, 2, 1, 4], 10)) # Out: 4
In python,
def order_fillers(order,k):
    if len(order)==0 or k==0:
        return 0
    order.sort()
    max_orders=0
    for item in order:
        if k<=0:
            return max_orders
        if item<=k:
            max_orders+=1
            k-=item
    return max_orders
JavaScript Solution
function filledOrders(order, k) {
let total = 0;
let count = 0;
const ordersLength = order.length;
const sortedOrders = order.sort((a, b) => a - b); // numeric sort; the default sort compares as strings
for (let i = 0; i < ordersLength; i++) {
if (total + sortedOrders[i] <= k) {
// if all orders able to be filled
if (total <= k && i === ordersLength - 1) return ordersLength;
total += sortedOrders[i];
count++;
} else {
return count;
}
}
}
// Validation
console.log(filledOrders([3, 2, 1], 3)); // 2
console.log(filledOrders([3, 2, 1], 1)); // 1
console.log(filledOrders([3, 2, 1], 10)); // 3
console.log(filledOrders([3, 2, 1], 0)); // 0
console.log(filledOrders([3, 2, 2], 1)); // 0
I need to convert a list of ints to a string containing all the ranges in the list.
So for example, the output should be as follows:
getIntRangesFromList([1,3,7,2,11,8,9,11,12,15]) -> "1-3,7-9,11-12,15"
So the input is not sorted and there can be duplicate values. The lists range in size from one element to 4k elements. The minimum and maximum values are 1 and 4094.
This is part of a performance critical piece of code. I have been trying to optimize this, but I can't find a way to get this faster. This is my current code:
def _getIntRangesFromList(list):
    if (list==[]):
        return ''
    list.sort()
    ranges = [[list[0],list[0]]] # ranges contains the start and end values of each range found
    for val in list:
        r = ranges[-1]
        if val==r[1]+1:
            r[1] = val
        elif val>r[1]+1:
            ranges.append([val,val])
    return ",".join(["-".join([str(y) for y in x]) if x[0]!=x[1] else str(x[0]) for x in ranges])
Any idea on how to get this faster?
This could be a task for the itertools module.
import itertools

list_num = [1, 2, 3, 7, 8, 9, 11, 12, 15]
groups = (list(x) for _, x in
          itertools.groupby(list_num, lambda x, c=itertools.count(): x - next(c)))
print(', '.join('-'.join(map(str, (item[0], item[-1])[:len(item)])) for item in groups))
This will give you 1-3, 7-9, 11-12, 15.
To understand what's going on you might want to check the content of groups.
import itertools

list_num = [1, 2, 3, 7, 8, 9, 11, 12, 15]
groups = (list(x) for _, x in
          itertools.groupby(list_num, lambda x, c=itertools.count(): x - next(c)))
for element in groups:
    print('element={}'.format(element))
This will give you the following output.
element=[1, 2, 3]
element=[7, 8, 9]
element=[11, 12]
element=[15]
The basic idea is to have a counter running parallel to the numbers. groupby will create individual groups for numbers with the same numerical distance to the current value of the counter.
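To make the grouping key visible, the difference between each value and its running counter can be computed directly. This is only an illustration of the idea, using enumerate in place of itertools.count; consecutive numbers share the same difference.

```python
list_num = [1, 2, 3, 7, 8, 9, 11, 12, 15]

# value minus its index: constant within a run of consecutive numbers
keys = [x - i for i, x in enumerate(list_num)]
print(keys)  # [1, 1, 1, 4, 4, 4, 5, 5, 7]
```

Each distinct key corresponds to one of the groups printed above. This only works when the input is sorted and free of duplicates.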
I don't know whether this is faster on your version of Python. You'll have to check this yourself. In my setting it's slower with this data set, but faster with a bigger number of elements.
The fastest one I could come up, which tests about 10% faster than your solution on my machine (according to timeit):
def _ranges(l):
    if l:
        l.sort()
        return ''.join([(str(l[i]) + ('-' if l[i] + 1 == l[i + 1] else ','))
                        for i in range(0, len(l) - 1) if l[i - 1] + 2 != l[i + 1]] +
                       [str(l[-1])])
    else: return ''
The above code assumes that the values in the list are unique. If they aren't, it's easy to fix but there's a subtle hack which will no longer work and the end result will be slightly slower.
I actually timed _ranges(u[:]) because of the sort; u is 600 randomly selected integers from range(1000) comprising 235 subsequences; 83 are singletons and 152 contain at least two numbers. If the list is sorted, quite a lot of time is saved.
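For inputs that may contain duplicates, one straightforward (untimed) option is to deduplicate with a set before scanning for runs. The sketch below is not the one-liner above but a plain loop, and int_ranges is a name chosen only for this example.

```python
def int_ranges(values):
    """Format unsorted ints, possibly with duplicates, as e.g. '1-3,7-9,11-12,15'."""
    nums = sorted(set(values))       # drop duplicates, then sort
    if not nums:
        return ""
    parts = []
    start = prev = nums[0]
    for n in nums[1:]:
        if n != prev + 1:            # the current run ends at prev
            parts.append(str(start) if start == prev else f"{start}-{prev}")
            start = n
        prev = n
    parts.append(str(start) if start == prev else f"{start}-{prev}")
    return ",".join(parts)

print(int_ranges([1, 3, 7, 2, 11, 8, 9, 11, 12, 15]))  # 1-3,7-9,11-12,15
```

Whether this beats the original code would have to be measured with timeit on representative data.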
def _to_range(l, start, stop, idx, result):
    if idx == len(l):
        result.append((start, stop))
        return result
    if l[idx] - stop > 1:
        result.append((start, stop))
        return _to_range(l, l[idx], l[idx], idx + 1, result)
    return _to_range(l, start, l[idx], idx + 1, result)

def get_range(l):
    if not l:
        return []
    return _to_range(l, start = l[0], stop = l[0], idx = 0, result = [])

l = [1, 2, 3, 7, 8, 9, 11, 12, 15]
result = get_range(l)
print(result)
>>> [(1, 3), (7, 9), (11, 12), (15, 15)]
# I think it's better to fetch the data as it is and if needed, change it
# with
print(','.join('-'.join([str(start), str(stop)]) for start, stop in result))
>>> 1-3,7-9,11-12,15-15
If you don't care about keeping the data as tuples, you can just append str(start) + '-' + str(stop) in the _to_range function, so that there is no need for the extra '-'.join call later.
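Because `_to_range` recurses once per element, a long list can run into Python's recursion limit (usually 1000 by default). Here is a loop-based sketch that should return the same (start, stop) tuples; the name `get_range_iterative` is made up for illustration, and the input is assumed to be sorted already.

```python
def get_range_iterative(numbers):
    """Return (start, stop) tuples for runs of consecutive integers in a sorted list."""
    if not numbers:
        return []
    result = []
    start = stop = numbers[0]
    for value in numbers[1:]:
        if value - stop > 1:
            # Gap found: close the current run and start a new one.
            result.append((start, stop))
            start = value
        stop = value
    # Close the final run.
    result.append((start, stop))
    return result

print(get_range_iterative([1, 2, 3, 7, 8, 9, 11, 12, 15]))
# [(1, 3), (7, 9), (11, 12), (15, 15)]
```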
I'll concentrate on performance, which is your main issue. I'll give 2 solutions:
1) If the integers are bounded between A and B, and you can allocate an array of booleans with (B - A + 2) elements (or an array of bits, to cover a wider range), e.g. A = 0 and B = 1 000 000, we can do this (I'll write it in C#, sorry XD). This runs in O(N + (B - A)) and is a good solution when B - A is not much larger than the number of values:
public string getIntRangesFromList(int[] numbers)
{
//You can change these 2 constants
const int A = 0;
const int B = 1000000;
//Create an array with all its values in false by default
//The last value is left false on purpose: the array stores 1 more value than needed so that the 2nd loop can close the final range
bool[] apparitions = new bool[B - A + 2];
int minNumber = B + 1;
int maxNumber = A - 1;
int pos;
for (int i = 0; i < numbers.Length; i++)
{
pos = numbers[i] - A;
apparitions[pos] = true;
if (minNumber > pos)
{
minNumber = pos;
}
if (maxNumber < pos)
{
maxNumber = pos;
}
}
//I will keep the concatenation simple, but you can make it faster to improve performance
string result = "";
bool isInRange = false;
bool isFirstRange = true;
int firstPosOfRange = 0; //Irrelevant what is its initial value
for (int i = minNumber; i <= maxNumber + 1; i++)
{
if (!isInRange)
{
if (apparitions[i])
{
if (!isFirstRange)
{
result += ",";
}
else
{
isFirstRange = false;
}
result += (i + A);
isInRange = true;
firstPosOfRange = i;
}
}
else
{
if (!apparitions[i])
{
if (i > firstPosOfRange + 1)
{
result += "-" + (i + A - 1);
}
isInRange = false;
}
}
}
return result;
}
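For readers who prefer Python, the presence-array idea of solution 1 can be sketched as follows. This is a rough translation, not the code above: the function name `ranges_from_presence` and its `low`/`high` parameters (playing the role of A and B) are assumptions made for illustration, and every input value is assumed to lie within [low, high].

```python
def ranges_from_presence(numbers, low, high):
    """Format runs of consecutive integers using a presence array over [low, high]."""
    # One extra slot stays False so that the last run is always closed.
    present = [False] * (high - low + 2)
    for n in numbers:
        present[n - low] = True
    parts = []
    run_start = None
    for offset, seen in enumerate(present):
        if seen and run_start is None:
            run_start = offset              # a new run begins here
        elif not seen and run_start is not None:
            first, last = run_start + low, offset - 1 + low
            parts.append(str(first) if first == last else "{}-{}".format(first, last))
            run_start = None                # the run just ended
    return ",".join(parts)

print(ranges_from_presence([15, 1, 2, 3, 7, 8, 9, 11, 12], 0, 20))
# 1-3,7-9,11-12,15
```

No sorting is needed, because the array positions already impose the order.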
2) O(N * log N)
public string getIntRangesFromList2(int[] numbers)
{
string result = "";
if (numbers.Length > 0)
{
Array.Sort(numbers); //sorting in place (OrderBy would return a new sequence and leave the array untouched); this makes the algorithm O(N * log N)
result += numbers[0];
int countNumbersInRange = 1;
for (int i = 1; i < numbers.Length; i++)
{
if (numbers[i] != numbers[i - 1] + 1)
{
if (countNumbersInRange > 1)
{
result += "-" + numbers[i - 1];
}
result += "," + numbers[i];
countNumbersInRange = 1;
}
else
{
countNumbersInRange++;
}
}
}
return result;
}
I have been given a set S of n integers, and have to print the size of a maximal subset S' of S where the sum of any 2 numbers in S' is not evenly divisible by k.
Input Format
The first line contains 2 space-separated integers, n and k, respectively.
The second line contains n space-separated integers describing the unique values of the set.
My Code :
import sys
n,k = raw_input().strip().split(' ')
n,k = [int(n),int(k)]
a = map(int,raw_input().strip().split(' '))
count = 0
for i in range(len(a)):
    for j in range(len(a)):
        if (a[i]+a[j])%k != 0:
            count = count+1
print count
Input:
4 3
1 7 2 4
Expected Output:
3
My Output:
10
What am I doing wrong?
You can solve it in O(n) time using the following approach:
L = [0]*k
for x in a:
    L[x % k] += 1
res = 0
for i in range(k//2+1):
    if i == 0 or k == i*2:
        res += bool(L[i])
    else:
        res += max(L[i], L[k-i])
print(res)
Yes, an O(n) solution for this problem is very much possible. As planetp rightly pointed out, it's pretty much the same solution that I have coded in Java. I added comments for better understanding.
import java.io.*; import java.util.*;
public class Solution {
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
int n=in.nextInt();
int k=in.nextInt();
int [] arr = new int[k];
Arrays.fill(arr, 0);
Map<Integer,Integer> mp=new HashMap<>();
Storing the values in a map considering there are no duplicates. You can store them in array list if there are duplicates. Only then you have different results.
for(int i=0;i
int res=0;
for(int i=0;i<=(k/2);i++)
{
if(i==0 || k==i*2)
{
if(arr[i]!=0)
res+=1;
}
If a number is divisible by k, we can keep only one such number, and if a number's remainder is exactly half of k, we can also keep only one. Rationale: if a and b are both divisible by k, then a+b is also divisible by k. Similarly, if c%k = k/2 and d%k = k/2, then c+d is divisible by k. Hence we restrict each of these two groups to one value.
else
{
int p=arr[i];
int q=arr[k-i];
if(p>=q)
res+=p;
else
res+=q;
}
This is simple: for each remainder x from 1 up to k/2, compare arr[x] with arr[k-x] and keep whichever group is larger. E.g. if k=4 and the numbers are 1,3,5,7,9,13,17, then arr[1]=5 and arr[3]=2, so we pick arr[1], because 1,5,9,13,17 can all be kept together.
}
System.out.println(res);
}
}
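The remainder-counting argument can be checked on both examples from this thread with a short Python sketch. The function name `max_non_divisible_subset` is made up for illustration; the logic follows the same remainder buckets as the answers above.

```python
def max_non_divisible_subset(values, k):
    # counts[r] = how many values leave remainder r when divided by k
    counts = [0] * k
    for v in values:
        counts[v % k] += 1
    size = 0
    for r in range(k // 2 + 1):
        if r == 0 or 2 * r == k:
            # Two values from this bucket would sum to a multiple of k,
            # so at most one of them can be kept.
            size += 1 if counts[r] else 0
        else:
            # Buckets r and k - r conflict with each other: keep the larger one.
            size += max(counts[r], counts[k - r])
    return size

print(max_non_divisible_subset([1, 7, 2, 4], 3))               # 3
print(max_non_divisible_subset([1, 3, 5, 7, 9, 13, 17], 4))    # 5
```

In the second call the remainder-1 bucket holds 1, 5, 9, 13, 17 and the remainder-3 bucket holds 3, 7, so the larger bucket of size 5 is kept.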
# given k, n and a as per your input.
# Will return 0 directly if n == 1
def maxsize(k, n, a):
    import itertools
    while n > 1:
        sets = itertools.combinations(a, n)
        for set_ in sets:
            if all((u+v) % k for (u, v) in itertools.combinations(set_, 2)):
                return n
        n -= 1
    return 0
Java solution
import java.io.PrintStream;
import java.util.Scanner;
public class Solution {
static PrintStream out = System.out;
public static void main(String[] args) {
/* Enter your code here. Read input from STDIN. Print output to STDOUT. Your class should be named Solution. */
Scanner in = new Scanner (System.in);
int n = in.nextInt();
int k = in.nextInt();
int[] A = new int[n];
for(int i=0;i<n;i++){
A[i]=in.nextInt();
}
int[] R = new int[k];
for(int i=0;i<n;i++)
R[A[i] % k]+=1;
int res=0;
for(int i=0;i<k/2+1;i++){
if(i==0 || k==i*2)
res+= (R[i]!=0)?1:0;
else
res+= Math.max(R[i], R[k-i]);
}
out.println(res);
}
}
I am an absolute noob at Python and have just started practicing on LeetCode. Take a look at this TwoSum exercise: Given an array of integers, find two numbers such that they add up to a specific target number.
Here is my code for this exercise:
class Solution(object):
    def __init__(self, nums, target):
        self.nums = nums
        self.target = target
    def twoSum(self):
        for i in range(len(self.nums)):
            for j in range(i+1, len(self.nums)):
                if self.nums[i] + self.nums[j] == self.target:
                    print "index1=" + str(i) + ", index2=" + str(j)
sample = Solution([2, 8, 7, 15], 9)
sample.twoSum()
Can anyone tell me what the LeetCode answer should look like? Would this one be OK for an interview? Thanks
I wouldn't consider your code or the itertools solution acceptable, because they are both O(n^2). If given in an interview, the interviewer probably wants to see that you can do better than just running two nested for loops.
I would use a hash table or sort the array and then binary search the answer.
Hash table pseudocode
h = empty hash table
for each element e in array
    if target - e in hash table:
        we have a solution
    add e to hash table
This will have complexity O(n), subject to some hash table quirks: in the worst case, it can be O(n^2), but that shouldn't happen for random inputs.
Binary search pseudocode
sort array
for each element e in array
    binary search for target - e, make sure it's not found at the same index
    if it is, check the before / after element
    or think how you can avoid this happening
This will always be O(n log n).
If complexity doesn't matter, then the itertools solution is nice, but your code also gets the job done.
This code is acceptable in an interview, but in real life you should get to know the libraries. In this example it's itertools.combinations:
from itertools import combinations
for item in combinations([2, 8, 7, 15], 2):
    if sum(item) == 9:
        print item # prints (2, 7)
Brute Force (Naive), Time Complexity O(n^2):
class Solution:
    def twoSum(self, nums, target):
        for i in range(0, len(nums)):
            to_find = target-nums[i]
            for j in range(0, len(nums)):
                if j!=i and nums[j] == to_find:
                    return [i, j]
        return [-1, -1]
Using Sorting, Time Complexity O(nlogn):
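class Solution:
    def twoSum(self, nums, target):
        # Sort (value, original index) pairs so the original indices survive sorting
        pairs = sorted((num, idx) for idx, num in enumerate(nums))
        for i in range(len(pairs)):
            to_find = target - pairs[i][0]
            # Search only to the right of i, so an element is never paired with itself
            left, ryt = i + 1, len(pairs) - 1
            while left <= ryt:
                mid = (left+ryt)//2
                if pairs[mid][0] == to_find:
                    return [pairs[i][1], pairs[mid][1]]
                elif pairs[mid][0] > to_find:
                    ryt = mid-1
                else:
                    left = mid+1
        return [-1, -1]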
Using Sorting with Two Pointer Approach, Time Complexity O(nlogn):
Improved version of above sorting approach but still Time Complexity is O(nlogn)
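No code is given for the two-pointer approach, so here is a minimal sketch of how it could look. The function name `two_sum_two_pointers` is made up for illustration. It sorts the indices by value (so the original positions are kept), then moves one pointer in from each end; the sort still dominates, so the total remains O(nlogn).

```python
def two_sum_two_pointers(nums, target):
    # Sort indices by their values so original positions are preserved.
    order = sorted(range(len(nums)), key=lambda i: nums[i])
    left, right = 0, len(order) - 1
    while left < right:
        total = nums[order[left]] + nums[order[right]]
        if total == target:
            return sorted([order[left], order[right]])
        if total < target:
            left += 1    # need a larger sum: advance the small end
        else:
            right -= 1   # need a smaller sum: pull back the large end
    return [-1, -1]      # same "not found" convention as the other snippets

print(two_sum_two_pointers([2, 8, 7, 15], 9))  # [0, 2]
```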
Using Hashmap, Time Complexity O(n):
class Solution:
    def twoSum(self, nums, target):
        num_to_idx = {}
        for i, num in enumerate(nums):
            if target-num in num_to_idx:
                return [i, num_to_idx[target-num]]
            num_to_idx[num] = i
        return [-1, -1]
Your solution uses a nested loop, which may cause a timeout error. You can use a dictionary to optimize performance.
This is my solution:
class Solution:
    def twoSum(self, num, target):
        map = {}
        for i in range(len(num)):
            if num[i] in map:
                return map[num[i]], i
            else:
                map[target - num[i]] = i
        return -1, -1
What's more, you should never modify the public method signature.
Using a hash table is the easiest solution:
class Solution(object):
    def twoSum(self, nums, target):
        """
        :type nums: List[int]
        :type target: int
        :rtype: List[int]
        """
        d = {}
        for i, n in enumerate(nums):
            if n in d:
                return [d[n], i]
            d[target - n] = i
enjoy
The thing is many interviewers ask to solve the problem in O(n) time complexity.
Here is a tip: if an interviewer asks you to reduce the time complexity from O(n^2) to O(n), or from O(n^3) to O(n^2), you can guess that a hash table is needed about 60% of the time. You can easily solve the twoSum problem using a hash table (in Python it is called a dictionary). Here is my Python solution to the problem, which is 94% faster than the accepted ones:
from typing import List
class Solution:
    def twoSum(self, nums: List[int], target: int) -> List[int]:
        l = len(nums)
        my_dic = {}
        for i in range(l):
            t = target - nums[i]
            v=my_dic.get(t,-1)
            if v!=-1:
                if my_dic[t]!=i:
                    return [my_dic[t],i]
            else:
                my_dic[nums[i]] = i
        return []
You can ask me any question if you don't understand the solution
I have used a dictionary to store value/index as searching for a key in a dictionary can be done in O(1).
from typing import List
class Solution:
    def twoSum(self, nums: List[int], target: int) -> List[int]:
        dict_nums = dict()
        for index, value in enumerate(nums):
            reminder = target - value
            if reminder in dict_nums:
                return [dict_nums[reminder], index]
            dict_nums[value] = index
Time complexity: O(n)
Space complexity: O(n)
def twoSum(nums, target):
    a = {}  # maps index -> value still needed (target - nums[index])
    i = 0
    while i < len(nums):
        if nums[i] not in a.values():
            a[i] = target - nums[i]
        else:
            keys = list(a.keys())
            vals = list(a.values())
            key = keys[vals.index(nums[i])]
            return [i,key]
        i += 1
    return
Using a dictionary, you could solve this with better time complexity.
There were Binary Search suggestions mentioned in other answers, but there was no implementation for them. So I decided to do this.
First I sort the array of numbers and then do a binary search. For each number n, I binary-search for target - n in the sorted array.
Overall time complexity is O(N * Log(N)).
def two_sums(nums, target):
    def binary_search(f, a, b):
        while a < b:
            m = (a + b) // 2
            if f(m):
                b = m
            else:
                a = m + 1
        assert a == b and f(a), (a, b, f(a))
        return a
    nums = sorted(enumerate(nums), key = lambda e: e[1])
    for i, n in nums:
        begin, end = [
            binary_search(lambda j: j >= len(nums) or
                fcmp(nums[j][1], target - n), 0, len(nums))
            for fcmp in [lambda a, b: a >= b, lambda a, b: a > b]
        ]
        for j in range(begin, end):
            if nums[j][0] != i:
                return sorted([nums[j][0], i])
print(two_sums([2, 8, 7, 15], 9))
Output:
[0, 2]
C++
class Solution {
public:
vector<int> twoSum(vector<int>& nums, int target) {
unordered_map<int, int> numToIndex;
for (int i = 0; i < nums.size(); ++i) {
if (numToIndex.count(target - nums[i]))
return {numToIndex[target - nums[i]], i};
numToIndex[nums[i]] = i;
}
throw;
}
};
C++
class Solution {
public:
vector<int> twoSum(vector<int>& nums, int target) {
map<int, int> umap;
int difference;
for(int i = 0; i < nums.size(); i++ ){
difference = target - nums.at(i);
if(umap.find(difference) != umap.end()) {
vector<int> v{umap[difference], i};
return v;
} else {
umap[nums.at(i)] = i;
}
}
return vector<int> {};
}
};
C++
class Solution {
public:
vector<int> twoSum(vector<int>& nums, int target) {
unordered_map<int, int> m;
vector<int> result;
for (int i=0; i<nums.size(); i++) {
if ( m.find(target - nums[i]) == m.end() ) {
m[nums[i]] = i;
}else{
result.push_back(m[target - nums[i]]);
result.push_back(i);
}
}
return result;
}
};
C++
class Solution {
public:
vector<int> twoSum(vector<int> &nums, int target) {
unordered_map<int, int> index_map;
for (int index = 0;; index++) {
auto iter = index_map.find(target - nums[index]);
if (iter != index_map.end())
return vector<int> {index, iter -> second};
index_map[nums[index]] = index;
}
}
};
C
/**
* Note: The returned array must be malloced, assume caller calls free().
*/
int* twoSum(int* nums, int numsSize, int target, int* returnSize){
*returnSize=2;
int *arr=(int *)malloc((*returnSize)*sizeof(int));
for(int i=0;i<numsSize;i++){
for(int j=i+1;j<numsSize;j++){
if(nums[i]+nums[j]==target){
arr[0]=i;
arr[1]=j;
break;
}
}
}
return arr;
}
JAVA
class Solution {
public int[] twoSum(int[] nums, int target) {
Map<Integer, Integer> numToIndex = new HashMap<>();
for (int i = 0; i < nums.length; ++i) {
if (numToIndex.containsKey(target - nums[i]))
return new int[] {numToIndex.get(target - nums[i]), i};
numToIndex.put(nums[i], i);
}
throw new IllegalArgumentException();
}
}
JAVA
class Solution {
public int[] twoSum(int[] nums, int target) {
int[] indices = new int[2];
Map<Integer, Integer> map = new HashMap<>();
for (int index = 0; index < nums.length; index++) {
if (map.containsKey(target - nums[index])) {
indices[1] = index;
indices[0] = map.get(target - nums[index]);
return indices;
}
map.put(nums[index], index);
}
return indices;
}
}
JAVA
class Solution {
public int[] twoSum(int[] nums, int target) {
if(nums==null || nums.length<2)
return new int[]{0,0};
HashMap<Integer, Integer> map = new HashMap<Integer, Integer>();
for(int i=0; i<nums.length; i++){
if(map.containsKey(nums[i])){
return new int[]{map.get(nums[i]), i};
}else{
map.put(target-nums[i], i);
}
}
return new int[]{0,0};
}
}
Python
from typing import List
class Solution:
    def twoSum(self, nums: List[int], target: int) -> List[int]:
        x = len(nums)
        for i in range(x-1):
            for j in range(1,x):
                if i == j:
                    continue
                if nums[i] + nums[j] == target:
                    return [i,j]
Python
class Solution:
    def twoSum(self, nums, target):
        length = len(nums)
        for i in range(length):
            for j in range(i + 1, length):
                if nums[i] + nums[j] == target:
                    return [i, j]
Python
class Solution:
    def twoSum(self, nums, target):
        index_map = {}
        for index, num in enumerate(nums):
            if target - num in index_map:
                return index_map[target - num], index
            index_map[num] = index
Python
from typing import List
class Solution:
    def twoSum(self, nums: List[int], target: int) -> List[int]:
        numToIndex = {}
        for i, num in enumerate(nums):
            if target - num in numToIndex:
                return numToIndex[target - num], i
            numToIndex[num] = i
javascript
/**
 * @param {number[]} nums
 * @param {number} target
 * @return {number[]}
 */
var twoSum = function (nums, target) {
    // Array to store the result
    const result = [];
    // Map from each number seen so far to its index
    const index_map = new Map();
    // Loop for each element in the array
    for (let i = 0; i < nums.length; i++) {
        let difference = target - nums[i];
        if (index_map.has(difference)) {
            result[0] = i;
            result[1] = index_map.get(difference);
            break;
        } else {
            index_map.set(nums[i], i);
        }
    }
    return result;
};
Golang
func twoSum(nums []int, target int) []int {
record := make(map[int]int)
for index, num := range nums {
difference := target - num
if res, ok := record[difference]; ok {
return []int{index, res}
}
record[num] = index
}
return []int{}
}
Kotlin
class Solution {
fun twoSum(nums: IntArray, target: Int): IntArray {
// Array to store result
val result = IntArray(2)
// This map will store the difference and the corresponding index
val map: MutableMap<Int, Int> = HashMap()
// Loop through the entire array
for (i in nums.indices) {
// If we have seen the current element before
// It means we have already encountered the other number of the pair
if (map.containsKey(nums[i])) {
// Index of the current element
result[0] = i
// Index of the other element of the pair
result[1] = map[nums[i]]!!
break
} else {
// Save the difference of the target and the current element
// with the index of the current element
map[target - nums[i]] = i
}
}
return result
}
}
You can ask me any question if you don't understand the solution