Delete all the previous year files from the Site collection using PowerShell


When a SharePoint site holds millions of files, you may no longer need the previous year files stored in its various document libraries. Deleting those files manually, library by library across the site collection, is a very time-consuming process.

In this article, we will focus on how to remove all the previous year files from every document library in the site collection using PowerShell. The script may take a long time to complete if the site contains a large number of files in its document libraries.

We need to follow the steps below to create the script that deletes the previous year files.

               1.  First, we need the SharePoint client “dll” files on the machine, and we need to add their paths in our script.


Add-Type -Path "C:\Program Files\Common Files\microsoft shared\Web Server Extensions\16\ISAPI\Microsoft.SharePoint.Client.dll"

Add-Type -Path "C:\Program Files\Common Files\microsoft shared\Web Server Extensions\16\ISAPI\Microsoft.SharePoint.Client.Runtime.dll"
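
If the “dll” files are missing from that folder, the two Add-Type lines above fail with an error. As a minimal sketch, assuming you would rather get a warning than an error, the load can be guarded with Test-Path. The helper name Import-AssemblyIfPresent is hypothetical and is not part of the original script.

```powershell
# Hypothetical helper (not in the original script): load an assembly only
# when the file actually exists, and warn instead of failing otherwise.
function Import-AssemblyIfPresent {
    param([Parameter(Mandatory)][string]$Path)

    if (Test-Path -LiteralPath $Path -PathType Leaf) {
        Add-Type -Path $Path
        return $true
    }
    Write-Warning "Assembly not found: $Path"
    return $false
}

# Same folder as in the article; adjust it if the files live elsewhere.
$isapi = 'C:\Program Files\Common Files\microsoft shared\Web Server Extensions\16\ISAPI'
$loaded = @(
    'Microsoft.SharePoint.Client.dll',
    'Microsoft.SharePoint.Client.Runtime.dll'
) | ForEach-Object { Import-AssemblyIfPresent -Path "$isapi\$_" }
```

After this runs, $loaded holds one True/False value per file, which you can check before continuing with the rest of the script.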

 

        2.       Now, you need to connect to the site from where you want to delete the documents.

 

#Change the site url

$siteurl="https://contoso.sharepoint.com/sites/TestSite/"

 

#Get Credentials to connect (optional: the -Interactive sign-in below does not use $Cred)

$Cred = Get-Credential

 

#Connect to PnP Online

Connect-PnPOnline -URL $SiteURL -Interactive

 

#Get the current web of the connected site using PnP PowerShell

$Webs = Get-PnPWeb

             3.  Add the function below, which fetches all the files from the given folder name/path and removes those whose “LastModified” year is not equal to 2022.

# Function for fetching files based on document library or folder name

Function GetFiles($Path){

    try{

        #Get all files using document library url

        $Files = Get-PnPFolderItem -FolderSiteRelativeUrl $Path -ItemType File

        Foreach ($File in $Files)

        {

            #Remove the file if it was last modified in any year other than 2022

            If($File.TimeLastModified.Year -ne 2022){

 

                #The line below sends the previous year documents to the recycle bin; comment it out to only list the files

                Remove-PnPFile -ServerRelativeUrl $File.ServerRelativeURL -Force -Recycle

                      Write-Host -f Red ("Deleted File: '{0}' at '{1}'" -f  $File.TimeLastModified.Year, $File.ServerRelativeURL)     

            }else{

                      Write-Host -f Green ("Present File: '{0}' at '{1}'" -f  $File.TimeLastModified.Year, $File.ServerRelativeURL)     

            }

        }  

    }

    catch {

        write-host "Error: $($_.Exception.Message)" -foregroundcolor Red

    } 

}
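
The check in GetFiles compares the year with -ne 2022, so a file with a later year (for example 2023) would be removed as well. As a minimal sketch, assuming you only want files older than a chosen year, the test can be written with -lt and the year passed as a parameter. The helper name Test-IsPreviousYearFile and its parameters are hypothetical and are not part of the original script.

```powershell
# Hypothetical helper (not in the original script): returns $true when the
# last-modified year is earlier than the year we want to keep.
function Test-IsPreviousYearFile {
    param(
        [Parameter(Mandatory)][datetime]$LastModified,
        # Defaults to the year in which the script is run
        [int]$CurrentYear = (Get-Date).Year
    )

    # -lt keeps files from the current year and from any later year,
    # whereas -ne would flag every year except the current one
    return $LastModified.Year -lt $CurrentYear
}

Test-IsPreviousYearFile -LastModified ([datetime]'2021-12-31') -CurrentYear 2022   # True
Test-IsPreviousYearFile -LastModified ([datetime]'2022-01-01') -CurrentYear 2022   # False
```

Inside GetFiles, the value to pass would be $File.TimeLastModified, the same property the original condition reads.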


            4.  Add the function below, which goes through all the folders and subfolders inside a specific folder, processes their files and removes the folders that are empty.

 

#Function for getting folder and subfolders

Function GetFolders($folderUrl)

{     

    try{

        $folderColl=Get-PnPFolderItem -FolderSiteRelativeUrl $folderUrl -ItemType Folder  

        # Loop through the folders 

        foreach($folder in $folderColl)

        {                     

            $newFolderURL= $folderUrl+"/"+$folder.Name 

            If($folder.Name -ne "Forms" -and (-Not($folder.Name.StartsWith("_")))){

                GetFiles($newFolderURL);

                GetFolders($newFolderURL);

                $libraryName = $folderUrl.Split("/")[1]

            

                #Remove empty folder

                if ($folder.ItemCount -eq 0 -and (($folder.Name -notmatch 'Document') -and ($folder.Name -notmatch $libraryName))) 

                { 

                    $ParentFolder = Get-PnPProperty -ClientObject $folder -Property ParentFolder

                    $ParentFolderURL =  $ParentFolder.ServerRelativeUrl.Substring($Webs.ServerRelativeUrl.Length)  

                    $ParentFolderURL = $ParentFolderURL.Substring(1,($ParentFolderURL.Length-1))

                    Remove-PnPFolder -Name $folder.Name -Folder $ParentFolderURL -Force

                    Write-Host -f Green ("Deleted Folder: '{0}' at '{1}'" -f $folder.Name, $folder.ServerRelativeURL)

                }  

            }

        } 

    }

    catch {

        write-host "Error: $($_.Exception.Message)" -foregroundcolor Red

    }

}

 

*Note: This function skips the folder named “Forms” and any folder whose name starts with “_”. It also removes all the empty folders.
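
The skip rule described in the note can be isolated into a small test function, which makes it easier to try out on folder names before running the deletion. This is a minimal sketch; the helper name Test-SkipFolder is hypothetical and is not part of the original script.

```powershell
# Hypothetical helper (not in the original script): returns $true for the
# folders that GetFolders leaves alone ("Forms" and names starting with "_").
function Test-SkipFolder {
    param([Parameter(Mandatory)][string]$Name)

    # -eq is case-insensitive in PowerShell, so "forms" is skipped as well
    return ($Name -eq 'Forms') -or $Name.StartsWith('_')
}

Test-SkipFolder -Name 'Forms'      # True
Test-SkipFolder -Name '_private'   # True
Test-SkipFolder -Name 'Reports'    # False
```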

 

                 5.  Add the code below to iterate through the site and every subsite and get all the document libraries (lists whose “BaseTemplate” is not 101, and libraries that are hidden, are not considered here).

     The document libraries named “Site Assets”, “Style Library” and “personalefiler” are skipped, as they contain important files which need to be preserved for the site.

 If you have any other important document library whose files need to be preserved, you can add a condition for it alongside the conditions already present for the “Site Assets”, “Style Library” and “personalefiler” document libraries.
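
One way to make that list of protected libraries easier to extend is to keep the titles in an array and test against it with -notin, instead of chaining -ne conditions. This is only a sketch under that assumption; the names $ExcludedLibraries and Test-LibraryIncluded are hypothetical and are not used by the script in this article.

```powershell
# Hypothetical names (not in the original script): titles of the document
# libraries whose files must never be deleted.
$ExcludedLibraries = @('Site Assets', 'Style Library', 'personalefiler')

function Test-LibraryIncluded {
    param(
        [Parameter(Mandatory)][string]$Title,
        [string[]]$Excluded = $ExcludedLibraries
    )

    # -notin compares case-insensitively, like the -ne checks it replaces
    return $Title -notin $Excluded
}

Test-LibraryIncluded -Title 'Documents'     # True
Test-LibraryIncluded -Title 'Site Assets'   # False
```

To protect another library, you would only add its title to $ExcludedLibraries.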

Try {

 

    #Iterate through each site

    ForEach($Web in $Webs)

    {

        #Get all document libraries - Exclude Hidden Libraries

        $DocumentLibraries = Get-PnPList | Where-Object {$_.BaseTemplate -eq 101 -and $_.Hidden -eq $false}

      

        #Iterate through each document library

        ForEach($doc in $DocumentLibraries)

        {

            #Apply condition to avoid the deletion of document libraries

            If($doc.Title -ne "Site Assets" -and $doc.Title -ne "Style Library" -and $doc.Title -ne "personalefiler"){

                $Library = Get-PnPList -Identity $doc.Title -Includes RootFolder

                $RootFolder = $Library.RootFolder

 

                 #Get the site relative path of the Folder

                If($RootFolder.Context.web.ServerRelativeURL -eq "/")

                {

                    $FolderSiteRelativeURL = $RootFolder.ServerRelativeUrl

                }

                Else

                {      

                    $FolderSiteRelativeURL = $RootFolder.ServerRelativeUrl.Replace($RootFolder.Context.web.ServerRelativeURL,[string]::Empty)

                }

 

                GetFiles($FolderSiteRelativeURL); 

                GetFolders($FolderSiteRelativeURL);

            }

        }

 

        #Get subsites in SharePoint Online using PnP PowerShell

        $WebsCollection = Get-PnPSubWeb -Recurse

 

        #Iterate through each subsite

        ForEach($Subsite in $WebsCollection)

        {

            #Print the subsite URL and connect to the subsite

            Write-host "Subsite" $Subsite.Url

 

            $FinalSubsiteurl = $Subsite.Url

            Connect-PnPOnline -Url $FinalSubsiteurl -Interactive

 

            #Get all document libraries - Exclude Hidden Libraries

            $DocumentLibrariesSubsites = Get-PnPList | Where-Object {$_.BaseTemplate -eq 101 -and $_.Hidden -eq $false}

       

             ForEach($subdoc in $DocumentLibrariesSubsites)

             {

              #Apply condition to avoid the deletion of document libraries

              If($subdoc.Title -ne "Site Assets" -and $subdoc.Title -ne "Style Library" -and $subdoc.Title -ne "personalefiler"){

                    $SubsiteLibrary = Get-PnPList -Identity $subdoc.Title -Includes RootFolder

                    $SubsiteRootFolder = $SubsiteLibrary.RootFolder

 

                    #Get the site relative path of the Folder

                    If($SubsiteRootFolder.Context.web.ServerRelativeURL -eq "/" -or [string]::IsNullOrEmpty($SubsiteRootFolder.Context.web.ServerRelativeURL))

                    {

                        $SubsiteFolderSiteRelativeURL = $SubsiteRootFolder.ServerRelativeUrl

                    }

                    Else

                    {      

                        $SubsiteFolderSiteRelativeURL = $SubsiteRootFolder.ServerRelativeUrl.Replace($SubsiteRootFolder.Context.web.ServerRelativeURL,[string]::Empty)

                    }

              

                    GetFiles($SubsiteFolderSiteRelativeURL);

                    GetFolders($SubsiteFolderSiteRelativeURL);

               }

            }

        }

    }

}

catch {

    write-host "Error: $($_.Exception.Message)" -foregroundcolor Red

}
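
The code above turns a server-relative URL into a site-relative one with .Replace(), which replaces every occurrence of the web URL, not just the leading one. As a minimal sketch, assuming you want to strip the web URL only when it is the prefix, the conversion could look like this. The helper name ConvertTo-SiteRelativeUrl and its parameters are hypothetical and are not part of the original script.

```powershell
# Hypothetical helper (not in the original script): remove the web's
# server-relative URL from the start of a folder's server-relative URL.
function ConvertTo-SiteRelativeUrl {
    param(
        [Parameter(Mandatory)][string]$ServerRelativeUrl,
        [string]$WebServerRelativeUrl = '/'
    )

    # A root web ("/") becomes an empty prefix, so nothing is stripped
    $webUrl = $WebServerRelativeUrl.TrimEnd('/')
    if ($webUrl -and $ServerRelativeUrl.StartsWith("$webUrl/", [System.StringComparison]::OrdinalIgnoreCase)) {
        return $ServerRelativeUrl.Substring($webUrl.Length)
    }
    return $ServerRelativeUrl
}

ConvertTo-SiteRelativeUrl -ServerRelativeUrl '/sites/TestSite/Shared Documents' -WebServerRelativeUrl '/sites/TestSite'
# /Shared Documents
ConvertTo-SiteRelativeUrl -ServerRelativeUrl '/Shared Documents' -WebServerRelativeUrl '/'
# /Shared Documents
```

For the common cases both approaches give the same result; they differ only when the web URL happens to appear again later in the folder path.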

 

    Below is the full script, which you can run in PowerShell after updating the site URL in $siteurl to point to your own site collection.

Add-Type -Path "C:\Program Files\Common Files\microsoft shared\Web Server Extensions\16\ISAPI\Microsoft.SharePoint.Client.dll"

Add-Type -Path "C:\Program Files\Common Files\microsoft shared\Web Server Extensions\16\ISAPI\Microsoft.SharePoint.Client.Runtime.dll"

 

#Change the site url

$siteurl="https://contoso.sharepoint.com/sites/TestSite/"

#Get Credentials to connect (optional: the -Interactive sign-in below does not use $Cred)

$Cred = Get-Credential

 

#Connect to PnP Online

Connect-PnPOnline -URL $SiteURL -Interactive

 

#Get the current web of the connected site using PnP PowerShell

$Webs = Get-PnPWeb

 

#function for getting folder and subfolders

Function GetFolders($folderUrl)

{     

    try{

        $folderColl=Get-PnPFolderItem -FolderSiteRelativeUrl $folderUrl -ItemType Folder  

        # Loop through the folders 

        foreach($folder in $folderColl)

        {                     

            $newFolderURL= $folderUrl+"/"+$folder.Name 

            If($folder.Name -ne "Forms" -and (-Not($folder.Name.StartsWith("_")))){

                GetFiles($newFolderURL);

                GetFolders($newFolderURL);

                $libraryName = $folderUrl.Split("/")[1]

            

                #Remove empty folder

                if ($folder.ItemCount -eq 0 -and (($folder.Name -notmatch 'Document') -and ($folder.Name -notmatch $libraryName))) 

                { 

                    $ParentFolder = Get-PnPProperty -ClientObject $folder -Property ParentFolder

                    $ParentFolderURL =  $ParentFolder.ServerRelativeUrl.Substring($Webs.ServerRelativeUrl.Length)  

                    $ParentFolderURL = $ParentFolderURL.Substring(1,($ParentFolderURL.Length-1))

                    Remove-PnPFolder -Name $folder.Name -Folder $ParentFolderURL -Force

                    Write-Host -f Green ("Deleted Folder: '{0}' at '{1}'" -f $folder.Name, $folder.ServerRelativeURL)

                }  

            }

        } 

    }

    catch {

        write-host "Error: $($_.Exception.Message)" -foregroundcolor Red

    }        

}

 

# function for fetching files based on document library or folder name

Function GetFiles($Path){

    try{

        #Get all files using document library url

        $Files = Get-PnPFolderItem -FolderSiteRelativeUrl $Path -ItemType File

        Foreach ($File in $Files)

        {

            #Remove the file if it was last modified in any year other than 2022

            If($File.TimeLastModified.Year -ne 2022){

 

                #The line below sends the previous year documents to the recycle bin; comment it out to only list the files

                Remove-PnPFile -ServerRelativeUrl $File.ServerRelativeURL -Force -Recycle

                      Write-Host -f Red ("Deleted File: '{0}' at '{1}'" -f  $File.TimeLastModified.Year, $File.ServerRelativeURL)     

            }else{

                      Write-Host -f Green ("Present File: '{0}' at '{1}'" -f  $File.TimeLastModified.Year, $File.ServerRelativeURL)     

            }

        }  

    }

    catch {

        write-host "Error: $($_.Exception.Message)" -foregroundcolor Red

    } 

}

 

Try {

 

    #Iterate through each site

    ForEach($Web in $Webs)

    {

        #Get all document libraries - Exclude Hidden Libraries

        $DocumentLibraries = Get-PnPList | Where-Object {$_.BaseTemplate -eq 101 -and $_.Hidden -eq $false}

      

        #Iterate through each document library

        ForEach($doc in $DocumentLibraries)

        {

            #Apply condition to avoid the deletion of document libraries

            If($doc.Title -ne "Site Assets" -and $doc.Title -ne "Style Library" -and $doc.Title -ne "personalefiler"){

                $Library = Get-PnPList -Identity $doc.Title -Includes RootFolder

                $RootFolder = $Library.RootFolder

 

                 #Get the site relative path of the Folder

                If($RootFolder.Context.web.ServerRelativeURL -eq "/")

                {

                    $FolderSiteRelativeURL = $RootFolder.ServerRelativeUrl

                }

                Else

                {      

                    $FolderSiteRelativeURL = $RootFolder.ServerRelativeUrl.Replace($RootFolder.Context.web.ServerRelativeURL,[string]::Empty)

                }

 

                GetFiles($FolderSiteRelativeURL); 

                GetFolders($FolderSiteRelativeURL);

            }

        }

 

        #Get subsites in SharePoint Online using PnP PowerShell

        $WebsCollection = Get-PnPSubWeb -Recurse

 

        #Iterate through each subsite

        ForEach($Subsite in $WebsCollection)

        {

            #Print the subsite URL and connect to the subsite

            Write-host "Subsite" $Subsite.Url

 

            $FinalSubsiteurl = $Subsite.Url

            Connect-PnPOnline -Url $FinalSubsiteurl -Interactive

 

            #Get all document libraries - Exclude Hidden Libraries

            $DocumentLibrariesSubsites = Get-PnPList | Where-Object {$_.BaseTemplate -eq 101 -and $_.Hidden -eq $false}

       

             ForEach($subdoc in $DocumentLibrariesSubsites)

             {

              #Apply condition to avoid the deletion of document libraries

              If($subdoc.Title -ne "Site Assets" -and $subdoc.Title -ne "Style Library" -and $subdoc.Title -ne "personalefiler"){

                    $SubsiteLibrary = Get-PnPList -Identity $subdoc.Title -Includes RootFolder

                    $SubsiteRootFolder = $SubsiteLibrary.RootFolder

 

                    #Get the site relative path of the Folder

                    If($SubsiteRootFolder.Context.web.ServerRelativeURL -eq "/" -or [string]::IsNullOrEmpty($SubsiteRootFolder.Context.web.ServerRelativeURL))

                    {

                        $SubsiteFolderSiteRelativeURL = $SubsiteRootFolder.ServerRelativeUrl

                    }

                    Else

                    {      

                        $SubsiteFolderSiteRelativeURL = $SubsiteRootFolder.ServerRelativeUrl.Replace($SubsiteRootFolder.Context.web.ServerRelativeURL,[string]::Empty)

                    }

              

                    GetFiles($SubsiteFolderSiteRelativeURL);

                    GetFolders($SubsiteFolderSiteRelativeURL);

               }

            }

        }

    }

}

catch {

    write-host "Error: $($_.Exception.Message)" -foregroundcolor Red

}

                                                                                                               - Rekha Panwar                          

 
