Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

excel - Delete large number of rows (e.g. ~500k rows) based on a certain criteria

I have a large number of rows and columns (e.g. 500k rows and 20 columns) all filled with numbers.

I'm trying to delete every row where column I has a certain value (e.g. less than or equal to 8), but when I use AutoFilter to delete those rows, Excel freezes and nothing is deleted.

It works quickly for data in column A. I remade similar data in a new sheet to make sure all cells were filled, no columns/rows were hidden etc.

Why is it freezing up for column I?

Sub DeleteRow()

    Application.DisplayAlerts = False
    Application.Calculation = xlCalculationManual
    Application.ScreenUpdating = False

    Dim ws As Worksheet
    Dim rng As Range
    Dim lastRow As Long

    Set ws = ActiveWorkbook.Sheets("Sheet1")

    'filter and delete all but header row which is in row 3
    lastRow = ws.Range("I" & ws.Rows.count).End(xlUp).row
    MsgBox lastRow
    Set rng = ws.Range("I3:I" & lastRow)

    ' filter and delete all but header row
    With rng
         .AutoFilter Field:=1, Criteria1:="<=8"
         .SpecialCells(xlCellTypeVisible).EntireRow.Delete
    End With

    ' turn off the filters
    If ActiveSheet.FilterMode Then
        ActiveSheet.ShowAllData
    End If

    Application.DisplayAlerts = True
    Application.Calculation = xlCalculationAutomatic
    Application.ScreenUpdating = True
End Sub

1 Reply


There are a lot of posts on Stack Overflow about deleting rows, some good, some not so good.

Two common approaches are AutoFilter (which you are using) and building a range with Union (one of which David has linked you to).

For a data set of this size and this many deletions, any method that works through Excel worksheet features (such as AutoFilter, Find, Sort, Union, formulas, etc.) will be slow. Some will be better than others, depending on the exact nature of your data.
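For comparison, the Union approach mentioned above might look roughly like the sketch below. It is not the linked answer's code: the procedure name is made up, and the sheet name, column I, header row 3 and threshold 8 are assumptions taken from the question. Expect it to be slow on 500k rows, since the Union range grows by one cell at a time.

```vba
Sub DeleteRowsWithUnion()
    ' Illustrative sketch only: the name, sheet, column and threshold are assumptions.
    Dim ws As Worksheet
    Dim toDelete As Range
    Dim lastRow As Long, i As Long

    Set ws = ActiveWorkbook.Sheets("Sheet1")
    lastRow = ws.Cells(ws.Rows.Count, "I").End(xlUp).Row

    ' Data is assumed to start below the header in row 3.
    ' Note: blank cells compare as 0, so they also match "<= 8".
    For i = 4 To lastRow
        If ws.Cells(i, "I").Value2 <= 8 Then
            If toDelete Is Nothing Then
                Set toDelete = ws.Cells(i, "I")
            Else
                Set toDelete = Union(toDelete, ws.Cells(i, "I"))
            End If
        End If
    Next i

    ' Delete all collected rows in a single call.
    If Not toDelete Is Nothing Then toDelete.EntireRow.Delete
End Sub
```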

There is another method that may work for you: do not actually delete the rows, but overwrite the data with a filtered copy of itself.

Note that this only works if you DO NOT have any formulas (either on this sheet or any other) that refer to the data being processed.
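A partial guard for that caveat could look like the sketch below. The function name is hypothetical. It relies on Range.HasFormula, which returns True if every cell holds a formula, False if none does, and Null for a mix. It only detects formulas inside the block being overwritten (they would be replaced by their values); it does not detect formulas elsewhere that refer to the block.

```vba
Function BlockContainsFormulas(ByVal rng As Range) As Boolean
    ' Hypothetical helper: True if any cell in rng contains a formula.
    Dim hasF As Variant

    hasF = rng.HasFormula   ' True = all cells, False = none, Null = mixed
    If IsNull(hasF) Then
        BlockContainsFormulas = True
    Else
        BlockContainsFormulas = CBool(hasF)
    End If
End Function
```

It could be called on the data range before overwriting, stopping with a message if it returns True.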

I ran this code on a sample data set of 500k rows by 20 columns of random numbers in the range 1..32 (so about 25% of the rows were deleted).

It ran in ~10 s.

Sub DeleteRows2()
    Dim ws As Worksheet
    Dim rng As Range
    Dim i As Long, j As Long
    Dim NewI As Long
    Dim dat As Variant, NewDat As Variant

    Dim TestCol As Long
    Dim Threshold As Long
    Dim LastRow As Long, LastCol As Long
    Dim t1 As Single, t2 As Single

    t1 = Timer()
    Application.ScreenUpdating = False
    Application.Calculation = xlCalculationManual

    TestCol = 9     ' column I
    Threshold = 8   ' rows with a value <= Threshold are dropped

    Set ws = Sheet1
    With ws
        ' Data block: from A1 to the last used row in column A and the last used column in row 1
        Set rng = .Range(.Cells(.Rows.Count, 1).End(xlUp), .Cells(1, .Columns.Count).End(xlToLeft))
    End With

    ' Read the whole block into memory in one call
    dat = rng.Value2
    ReDim NewDat(1 To UBound(dat, 1), 1 To UBound(dat, 2))

    LastRow = UBound(dat, 1)
    LastCol = UBound(dat, 2)

    ' Copy only the rows to keep, packed at the top of NewDat
    NewI = 0
    For i = 1 To LastRow
        If dat(i, TestCol) > Threshold Then
            NewI = NewI + 1
            For j = 1 To LastCol
                NewDat(NewI, j) = dat(i, j)
            Next
        End If
    Next

    ' Write back in one call; the unused rows at the bottom of NewDat are Empty,
    ' so they clear the leftover data
    rng.Value2 = NewDat

    Application.ScreenUpdating = True
    Application.Calculation = xlCalculationAutomatic

    t2 = Timer()
    MsgBox "deleted in " & t2 - t1 & "s"
End Sub
