LeetCode: Pascal's Triangle C#yt k L QqYy যQ J Nn
https://leetcode.com/explore/interview/card/top-interview-questions-easy/99/others/601/

In Pascal's triangle, each number is the sum of the two numbers directly above it.
Example:
Input: 5
Output:
[
[1],
[1,1],
[1,2,1],
[1,3,3,1],
[1,4,6,4,1]
]
Please review for performance.
using System.Collections.Generic;
using System.Linq;
using Microsoft.VisualStudio.TestTools.UnitTesting;
namespace RecurssionQuestions
{
/// <summary>
/// https://leetcode.com/explore/interview/card/top-interview-questions-easy/99/others/601/
/// </summary>
[TestClass]
public class PascalTriangleTest
{
[TestMethod]
public void TestMethod1()
{
IList<IList<int>> result = PascalTriangle.Generate(5);
IList<IList<int>> expected = new List<IList<int>>();
expected.Add(new List<int> { 1 });
expected.Add(new List<int> { 1, 1 });
expected.Add(new List<int> { 1, 2, 1 });
expected.Add(new List<int> { 1, 3, 3, 1 });
expected.Add(new List<int> { 1, 4, 6, 4, 1 });
for (int i = 0; i < expected.Count; i++)
{
CollectionAssert.AreEqual(expected[i].ToList(), result[i].ToList());
}
}
}
public class PascalTriangle
{
public static IList<IList<int>> Generate(int numRows)
{
IList<IList<int>> result = new List<IList<int>>();
if (numRows == 0)
{
return result;
}
List<int> row = new List<int>();
row.Add(1);
result.Add(row);
if (numRows == 1)
{
return result;
}
for (int i = 1; i < numRows; i++)
{
var prevRow = result[i - 1];
row = new List<int>();
row.Add(1);
for (int j = 0; j < prevRow.Count - 1; j++)
{
row.Add(prevRow[j] + prevRow[j + 1]);
}
row.Add(1);
result.Add(row);
}
return result;
}
}
}
-
1\\$\\begingroup\\$ To calculate the n th row, you need the n-1 th row to be calculated. This means your algorithm can only calculate top-down, and can not be extended to calculate a section or a single row. You might want to read about it here: en.wikipedia.org/wiki/…. \\$\\endgroup\\$ – dfhwze 10 hours ago
5 Answers
I wouldn't bother with the numRows == 1 special case (it is redundant and just adds code), but you should be checking that numRows >= 0 and throwing an argument exception if it is not (and ideally testing for that as well); currently it treats numRows < 0 as numRows == 1 which makes no sense at all.
As always, this would benefit from inline documentation, qualifying what the method does, and what it expects of the caller (e.g. a non-negative number of rows).
You might consider var for result and row. Within the method it is fine for result to be List<Ilist<int>>, and will have the nice side-effect of removing a little unnecessary indirection (I'm assuming the CLR won't/can't optimise that away).
You might consider using the List(int) constructor to improve the memory characteristics of your method, since you know in an advance how large each list will be. Alternatively, if performance matters enough that you will actually benchmark this, then consider an immutable return type (IReadOnlyList<IReadOnlyList<int>>) and use arrays directly.
I don't like the spacing in your code: I would at the very least separate the == 0 and == 1 cases with an empty line. They are logically distinct, and having them mushed together fails to convey this.
TestMethod1 is a poor name, and the text fails to check that the length of the output is correct (it could return 6 rows and you'd never know). Tests for edge cases (0, 1) and invalid arguments (-1) are important.
Your row variable is all over the place. I would put it inside the loop, and add the first one directly as result.Add(new List<int>() { 1 }). As it is, you could use prevRow = row, which would avoid a lookup.
Again, if you really care about performance, you should be benchmarking it with a realistic use case (unfortunately those don't exist for such tasks, and optimisation is basically pointless), but you could avoid making 2 lookups from the previous row for the 'inner' entries. Something like this should work:
// inside for (i...)
int left = 0;
for (int j = 0; j < i - 1; j++)
{
int right = prevRow[j];
row.Add(left + right);
left = right;
}
row.Add(1);
results.Add(row)
This would also allow you to remove the numRows == 1 special case, which would be amusing. Alternatively you could start at j = 1 and bake the row.Add(1) first entry.
You might also consider a lazy version producing an IEnumerable<IReadOnlyList<int>>: this would probably be slower than a non-lazy version for small inputs, but would enable you to handle larger triangles when you would otherwise run out of memory (would be linear rather than quadratic), and the improved memory characteristics may even lead to better performance.
-
\\$\\begingroup\\$ I am not sure whether we could gain performance for the bigger rows with this, but since each row is reflective around its centre, perhaps we should only calculate half the row and reflect the other half .. planetmath.org/… \\$\\endgroup\\$ – dfhwze 10 hours ago
-
2\\$\\begingroup\\$ +1. Some quick testing shows that using a list's capacity can make things 20-50% faster (less gain for higher numbers of rows), whereas using arrays directly is consistently more than 80% faster. Caching the previous row's left value is about 10-20% faster, and making use of the reflective nature of rows can yield another 10% improvement. \\$\\endgroup\\$ – Pieter Witvoet 7 hours ago
-
\\$\\begingroup\\$ @PieterWitvoet would you mind sharing your benchmark code? (since you've bothered to test it, that would be worthy of an answer in my book). I'm most surprised by the array performance: I didn't expect that much overhead from
List<T>(or was thatIList<T>?) \\$\\endgroup\\$ – VisualMelon 6 hours ago -
\\$\\begingroup\\$ @VisualMelon: sure, here you go. \\$\\endgroup\\$ – Pieter Witvoet 5 hours ago
-
1\\$\\begingroup\\$ You guys are great! Thanks \\$\\endgroup\\$ – Gilad 5 hours ago
Something I ran into while making my own implementation:
After 34 rows, the highest number in the row will be two 1166803110s. Adding these for row 35 exceeds the maximum value for ints, resulting in integer-overflow.
You might consider putting the line that does the addition into a checked block, so that an OverflowException is properly thrown:
for (int j = 0; j < prevRow.Count - 1; j++)
{
checked
{
row.Add(prevRow[j] + prevRow[j + 1]);
}
}
-
1\\$\\begingroup\\$ +1 I probably should have though about this before even mentioning memory... \\$\\endgroup\\$ – VisualMelon 6 hours ago
-
\\$\\begingroup\\$ @VisualMelon thanks, it actually surprised me how quickly the values grew. The maximum for each row nearly doubles. \\$\\endgroup\\$ – JAD 1 hour ago
VisualMelon already said everything I wanted to say and more, but at his request, here are the variants I tested. Using list capacity is about 20-50% faster (less gain for higher numbers of rows), while using arrays is consistently more than 80% faster. Left-value caching is about 10-20% faster, and making use of the reflective nature of rows (as mentioned by dfhwze) yields another 10% improvement.
It's also possible to calculate the contents of a row without first calculating all previous rows. It's roughly twice as slow per row because it involves multiplication and division instead of addition, but it's a lot more efficient if you only need a specific row.
// Original approach, modified to cache the previous row's left value:
public static IList<IList<int>> Generate_Improved(int numRows)
{
IList<IList<int>> result = new List<IList<int>>();
if (numRows == 0)
{
return result;
}
List<int> row = new List<int>();
row.Add(1);
result.Add(row);
if (numRows == 1)
{
return result;
}
for (int i = 1; i < numRows; i++)
{
var prevRow = result[i - 1];
row = new List<int>();
var left = 0;
for (int j = 0; j < prevRow.Count; j++)
{
int right = prevRow[j];
row.Add(left + right);
left = right;
}
row.Add(1);
result.Add(row);
}
return result;
}
// Original approach, modified to use list capacities:
public static IList<IList<int>> Generate_Capacity(int numRows)
{
IList<IList<int>> result = new List<IList<int>>(numRows);
if (numRows == 0)
{
return result;
}
List<int> row = new List<int>(1);
row.Add(1);
result.Add(row);
if (numRows == 1)
{
return result;
}
for (int i = 1; i < numRows; i++)
{
var prevRow = result[i - 1];
row = new List<int>(i + 1);
row.Add(1);
for (int j = 0; j < prevRow.Count - 1; j++)
{
row.Add(prevRow[j] + prevRow[j + 1]);
}
row.Add(1);
result.Add(row);
}
return result;
}
// Original approach, modified to use list capacities and caching the previous row's left value:
public static IList<IList<int>> Generate_Capacity_Improved(int numRows)
{
IList<IList<int>> result = new List<IList<int>>(numRows);
if (numRows == 0)
{
return result;
}
List<int> row = new List<int>(1);
row.Add(1);
result.Add(row);
if (numRows == 1)
{
return result;
}
for (int i = 1; i < numRows; i++)
{
var prevRow = result[i - 1];
row = new List<int>(i + 1);
var left = 0;
for (int j = 0; j < prevRow.Count; j++)
{
int right = prevRow[j];
row.Add(left + right);
left = right;
}
row.Add(1);
result.Add(row);
}
return result;
}
// Using arrays instead of lists:
public static IList<IList<int>> Generate_Array(int numRows)
{
var result = new int[numRows][];
if (numRows == 0)
return result;
result[0] = new int[] { 1 };
if (numRows == 1)
return result;
for (int i = 1; i < numRows; i++)
{
var prevRow = result[i - 1];
var row = new int[i + 1];
row[0] = 1;
for (int j = 0; j < i - 1; j++)
row[j + 1] = prevRow[j] + prevRow[j + 1];
row[i] = 1;
result[i] = row;
}
return result;
}
// Using arrays instead of lists, and caching the previous row's left value:
public static IList<IList<int>> Generate_Array_Improved(int numRows)
{
var result = new int[numRows][];
if (numRows == 0)
return result;
result[0] = new int[] { 1 };
if (numRows == 1)
return result;
for (int i = 1; i < numRows; i++)
{
var prevRow = result[i - 1];
var row = new int[i + 1];
var left = 0;
for (int j = 0; j < i; j++)
{
int right = prevRow[j];
row[j] = left + right;
left = right;
}
row[i] = 1;
result[i] = row;
}
return result;
}
// Using arrays instead of lists, caching the previous row's left value, and using row reflection:
public static IList<IList<int>> Generate_Array_Improved_Reflective(int numRows)
{
var result = new int[numRows][];
if (numRows == 0)
return result;
result[0] = new int[] { 1 };
if (numRows == 1)
return result;
for (int i = 1; i < numRows; i++)
{
var prevRow = result[i - 1];
var row = new int[i + 1];
var left = 0;
var mid = (i / 2) + 1;
for (int j = 0; j < mid; j++)
{
int right = prevRow[j];
var sum = left + right;
row[j] = sum;
row[i - j] = sum;
left = right;
}
result[i] = row;
}
return result;
}
// Using arrays, calculating each row individually:
public static IList<IList<int>> Generate_PerRow(int numRows)
{
var result = new int[numRows][];
for (int i = 0; i < numRows; i++)
{
var row = new int[i + 1];
row[0] = 1;
for (int j = 0; j < i; j++)
row[j + 1] = row[j] * (i - j) / (j + 1);
result[i] = row;
}
return result;
}
// Using arrays, calculating each row individually, using row reflection:
public static IList<IList<int>> Generate_PerRow_Reflective(int numRows)
{
var result = new int[numRows][];
for (int i = 0; i < numRows; i++)
{
var row = new int[i + 1];
row[0] = 1;
row[i] = 1;
var mid = i / 2;
for (int j = 0; j < mid; j++)
{
var value = row[j] * (i - j) / (j + 1);
row[j + 1] = value;
row[i - j - 1] = value;
}
result[i] = row;
}
return result;
}
-
\\$\\begingroup\\$ I wonder how much of the benefit from arrays is because it doesn't indirect through
IList; does swappingIList<IList<int>> result = new List<IList<int>>();forvar result = new List<IList<int>>();make any difference? \\$\\endgroup\\$ – VisualMelon 5 hours ago -
\\$\\begingroup\\$ @VisualMelon: good point.
IList<IList<int>> resultis twice as slow compared toint[][] result. \\$\\endgroup\\$ – Pieter Witvoet 5 hours ago
According to performance then if you use arrays for the rows, you can maybe gain a couple of milliseconds by taking advantage of the triangular symmetry by only iterate to the half of the previous line:
public static IList<IList<int>> Generate(int numRows)
{
IList<IList<int>> result = new List<IList<int>>(numRows);
if (numRows <= 0)
{
return result;
}
int[] row = { 1 };
var prevRow = row;
for (int i = 1; i <= numRows; i++)
{
row = new int[i];
row[0] = 1;
int half = prevRow.Length / 2;
for (int j = 0; j < half; j++)
{
int val = prevRow[j] + prevRow[j + 1];
row[j + 1] = val;
row[i - j - 2] = val;
}
row[i - 1] = 1;
result.Add(row);
prevRow = row;
}
return result;
}
I haven't measured very thoroughly but found only a minor improvement. And I couldn't measure any differences whether having half and val as locals or not. Anyway it's doable and signals maybe a "deeper" understanding of the nature of the problem :-)
-
2\\$\\begingroup\\$ You might have missed it, Pieter Wilvoet implemented and benchmarked this at my suggestion :) Performance was increased by 10%. \\$\\endgroup\\$ – dfhwze 42 mins ago
-
1\\$\\begingroup\\$ @dfhwze: ah!, sorry, I didn't read all the comments - just didn't see it as part of the answers. \\$\\endgroup\\$ – Henrik Hansen 38 mins ago
-
\\$\\begingroup\\$ No problem, it's a good answer nevertheless :) \\$\\endgroup\\$ – dfhwze 36 mins ago
If performance is your goal you'd be better off with an identity function and better memory management.
The additional benefit here, is that you can jump to the 53rd line Line(52) without having to calculate the preceding triangle.
public static class Pascal
{
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static ReadOnlySpan<long[]> Triangle(int n)
{
if (n < 1)
{
return ReadOnlySpan<long[]>.Empty;
}
Span<long[]> triangle = new long[n][];
for (var r = 0; r < n; r++)
{
triangle[r] = Line(r).ToArray();
}
return triangle;
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static ReadOnlySpan<long> Line(int n)
{
if (n < 0)
{
return ReadOnlySpan<long>.Empty;
}
if (n == 0)
{
return (ReadOnlySpan<long>)new long[]
{
1L
};
}
Span<long> result = new long[n + 1];
result[0] = 1L;
for (var k = 0; k < n; k++)
{
result[k + 1] = result[k] * (n - k) / (k + 1);
}
return result;
}
}