Compress
Prompt
Given an array of characters
chars
, compress it such that consecutive duplicate characters are replaced with the character followed by the count of duplicates. If the count is 1, omit it.
Example 1
Input: compress(["a", "b", "b", "b", "c"])
Output: ["a", "b", "3", "c"]
Example 2
Input: compress(["a", "a", "b", "b", "c", "c", "c"])
Output: ["a", "2", "b", "2", "c", "3"]
Approach: Sliding window
This is a problem that screams "sliding window" to me, where you iterate through the data and use two pointers within that iteration to compare two things at once, and either increase the size of that window or keep iterating the window based on matches or non-matches. That's the high level idea here, but I'll walk through the specific steps I want to take:
- Ensure array characters are in order,
sortedArray
- Create array to return,
returnArray
- Loop through the array using a sliding window
i
= loop index / start of windowj
= end of window
- If start of window and end of window are the same character, increase the size of the window and repeat comparison
- Otherwise:
- Push the array item into
returnArray
- Calculate the size of the window (
j
-i
),windowSize
- If
windowSize
is greater than 1, pushwindowSize
intoreturnArray
- Reframe the window so that the start of the window
i
is at the end of the old windowj
, and the end of the new window isi + 1
- Push the array item into
- Repeat steps 3 and 4 until
i
has reached the end of the array - Return
returnArray
Ensure array characters are in order
In both of the examples we're provided, the given array of characters is in alphabetical order already: ["a", "b", "b", "b", "c"]
and ["a", "a", "b", "b", "c", "c", "c"]
. This is great, but we can't always rely on this being the case. The first step for a problem like this is about data hygiene, that we get our provided input as clean as possible to help produce the desired outcome.
I am accounting for one edgecase, that the characters in the provided arrays might not be given in alphabetical order, but sorting the data before doing any other work. That looks like this:
function compress(array) { const sortedArray = array.sort();}
This relies on the JavaScript .sort()
array method to reorder our characters alphabetically and drop them into a new variable, sortedArray
.
That said, other edge cases we might need to account for include mixed upper- and lowercase letters, non-string values (like numbers, booleans, etc.), or strings that are longer than one character. In an actual interview these are the things I would ask about and be sure that I account for in my solution.
Create array to return and loop through sortedArray
using a sliding window
With our data cleaned up, we can now do two things: prepare the array we will eventually return from the function and set up our sliding window and loop. Basically at this point we're defining variables and creating our loop:
function compress(array) { const sortedArray = array.sort(); const returnArray = []; let i = 0; let j = 1;
while (i < sortedArray.length) { i += 1; // an initial exit condition }
return returnArray;}
Compare the start of the window with the end of the window and account for matches
As we loop through sortedArray
, i
will be the index of the first item in the array and the start of our window.
j
will start as the index of the second item in the array and will increment if that letter matches the previous letter. This is the end of our window.
If the letters match, j
will increase by 1 and we'll compare the start of the window (which has not changed) with the new end of the window. We repeat this until sortedArray[j]
does not match sortedArray[i]
:
function compress(array) { const sortedArray = array.sort(); const returnArray = []; let i = 0; let j = 1;
// Don't run this code, there's no exit condition and will cause an infinite loop while (i < sortedArray.length) { if (sortedArray[i] === sortedArray[j]) { j += 1; } }
return returnArray;}
Account for non-matches
Now we need to account for when the start and end of the window don't match. Here we need to do several things:
First, we need to push the letter into returnArray
; at this point we can access this from the start of the window (sortedArray[i]
):
returnArray.push(sortedArray[i]);
Next, we should figure out how many elements are inside of the window. We can just subtract the indices of j
and i
for this, and save it into a variable. I'll name it windowSize
.
The prompt requirements want us to not have windowSize
in returnArray
if windowSize
is 1
, but we do need to have windowSize
if it's greater than 1
, so we'll check that and push into returnArray
accordingly:
const windowSize = j - i;if (windowSize > 1) { returnArray.push(windowSize);}
However, if we push windowSize
as in, we are pushing a Number
into the array, where the prompt requirements indicate they want a String
. So we'll need to coerce windowSize
to a String
representation. I'll do so with Number()
:
const windowSize = j - i;if (windowSize > 1) { returnArray.push(String(windowSize));}
Finally, we can proceed with our loop, and to do so we need to reframe our window. We want the new start of our window to be where the old window ended, and then have the new end of our window be the array item after that:
i = j;j += 1;
This is all of our while
loop logic, and it will keep this sliding window pattern up until the start of the window has reached the end sortedArray
. The last thing to do, outside of our while
loop, is to return returnArray
:
return returnArray;
Solution
Having walked through everything, here is the full solution all put together:
function compress(array) { const sortedArray = array.sort(); const returnArray = []; let i = 0; let j = 1;
while (i < sortedArray.length) { if (sortedArray[i] === sortedArray[j]) { j += 1; } else { returnArray.push(sortedArray[i]); const windowSize = j - i; if (windowSize > 1) { returnArray.push(String(windowSize)); } i = j; j += 1; } }
return returnArray;}
Big O of this solution
The biggest downside to my solution is the .sort()
step. JavaScript's .sort()
array method carries with it a typical time complexity of O(n log n)
, so as-is that is the overall time complexity. That said, I am making an assumption that we might get non-sequential characters. In a real-world interview, I would clarify if that is the case or not.
If that weren't the case, and I could ditch the .sort()
method, then the time complexity reduces down to linear O(n)
, which is great. The while
loop goes through the length of the provided data once, and each item in the array is looked at for comparison once (technically twice when we reframe the windows).
As for space complexity, I am creating two new arrays: the sortedArray()
and the returnArray()
. sortedArray()
will always take up the same amount of memory as the provided array (making it O(n)
) and sortedArray()
will take up the same amount of memory as the provided array in the worst cases, though more often than not will be smaller. We'd call that O(n)
. With both of these new arrays, that would be 2O(n)
which I believe just gets simplified to O(n)
, so we have linear space complexity.
I could improve this by modifying the given array in place, which perhaps by the wording of the prompt I should have done. Admittedly, modifying data in place makes me nervous, so I tend to avoid doing so. Again, this is something I would clarify in an interview.
Alternate approach
I'm sure there are lots of ways of solving this problem. One that comes to mind is to use a set
. Something like the following (generated by Google search AI answer):
function deduplicateAndCountDuplicates(array) { const counts = {}; const uniqueArray = [];
for (const item of array) { counts[item] = (counts[item] || 0) + 1; if (counts[item] === 1) { uniqueArray.push(item); } }
const duplicates = Object.fromEntries( Object.entries(counts).filter(([, count]) => count > 1) );
return { uniqueArray, duplicates };}
const arrayWithDuplicates = [1, 2, 2, 3, 4, 4, 4, 5];const result = deduplicateAndCountDuplicates(arrayWithDuplicates);
console.log("Unique array:", result.uniqueArray);// Expected output: Unique array: [1, 2, 3, 4, 5]
console.log("Duplicates:", result.duplicates);// Expected output: Duplicates: { '2': 2, '4': 3 }
Obviously the data is a little different here, but the broad strokes are there. I would probably try to refactor this to put any of the result.duplicates
that are greater than 1
into the result.uniqueArray
. I am excited to see others' solutions!